[jira] [Commented] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367387#comment-14367387 ] Minh Do commented on CASSANDRA-8094: Will have some time in the next couple of weeks to check it in. Thanks Heavy writes in RangeSlice read requests -- Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.14 RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, whether or not read_repair_chance is set. Because of this, in clusters with low writes and high reads, there is very heavy write traffic between nodes. We should have an option to turn this off, and it can be different from read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
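For illustration, here is a minimal sketch of the kind of per-CF gate this ticket asks for. It is not the actual Cassandra resolver code; the class name and the rangeSliceReadRepairChance setting are hypothetical, chosen only to show range slice read repair being gated by a chance that is separate from read_repair_chance.

    import java.util.concurrent.ThreadLocalRandom;

    // Hypothetical gate: schedule read repair for a range slice only when a
    // separate per-column-family chance allows it, instead of unconditionally.
    final class RangeSliceRepairGate
    {
        private final double rangeSliceReadRepairChance; // 0.0 turns it off entirely

        RangeSliceRepairGate(double rangeSliceReadRepairChance)
        {
            this.rangeSliceReadRepairChance = rangeSliceReadRepairChance;
        }

        boolean shouldScheduleRepair()
        {
            return ThreadLocalRandom.current().nextDouble() < rangeSliceReadRepairChance;
        }
    }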
[jira] [Commented] (CASSANDRA-8751) C* should always listen to both ssl/non-ssl ports
[ https://issues.apache.org/jira/browse/CASSANDRA-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335507#comment-14335507 ] Minh Do commented on CASSANDRA-8751: A TLS/SSL socket by design only processes secured or encrypted messages. How can we use this one TLS/SSL socket to process both plain-text and encrypted messages simultaneously? I don't think we can get away from this. C* should always listen to both ssl/non-ssl ports - Key: CASSANDRA-8751 URL: https://issues.apache.org/jira/browse/CASSANDRA-8751 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Critical Since there is always one thread dedicated to each server socket listener and it does not use many resources, we should always have both listeners up no matter what users set for internode_encryption. The reason behind this is that we need to switch back and forth between different internode_encryption modes, and we need C* servers to keep running in a transient state during mode switching. Currently this is not possible. For example, we have an internode_encryption=dc cluster in a multi-region AWS environment and want to set internode_encryption=all by rolling-restarting the C* nodes. However, a node with internode_encryption=all does not open the non-ssl port to listen. As a result, we have a split-brain cluster here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
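The two-listener proposal side-steps that problem: instead of trying to serve plain-text and TLS traffic on one socket, the node binds both ports regardless of internode_encryption. A minimal sketch, assuming the default ports 7000 (storage_port) and 7001 (ssl_storage_port) and the default SSL context; this is not the actual MessagingService code:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import javax.net.ssl.SSLServerSocket;
    import javax.net.ssl.SSLServerSocketFactory;

    final class DualPortListener
    {
        static final int STORAGE_PORT = 7000;      // non-ssl internode port
        static final int SSL_STORAGE_PORT = 7001;  // ssl internode port

        static void bindBoth() throws IOException
        {
            // Plain listener, always bound even when internode_encryption=all.
            ServerSocket plain = new ServerSocket();
            plain.bind(new InetSocketAddress(STORAGE_PORT));

            // SSL listener, always bound even when internode_encryption=none.
            SSLServerSocket ssl = (SSLServerSocket)
                    SSLServerSocketFactory.getDefault().createServerSocket();
            ssl.bind(new InetSocketAddress(SSL_STORAGE_PORT));

            // Each socket gets its own accept thread; as the ticket notes, the
            // cost of the extra listener thread is negligible.
        }
    }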
[jira] [Created] (CASSANDRA-8751) C* should always listen to both ssl/non-ssl ports
Minh Do created CASSANDRA-8751: -- Summary: C* should always listen to both ssl/non-ssl ports Key: CASSANDRA-8751 URL: https://issues.apache.org/jira/browse/CASSANDRA-8751 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Critical Since there is always one thread dedicated to each server socket listener and it does not use many resources, we should always have both listeners up no matter what users set for internode_encryption. The reason behind this is that we need to switch back and forth between different internode_encryption modes, and we need C* servers to keep running in a transient state during mode switching. Currently this is not possible. For example, we have an internode_encryption=dc cluster in a multi-region AWS environment and want to set internode_encryption=all by rolling-restarting the C* nodes. However, a node with internode_encryption=all does not open the non-ssl port to listen. As a result, we have a split-brain cluster here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-8094: --- Due Date: 15/Jan/15 (was: 14/Nov/14) Heavy writes in RangeSlice read requests -- Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.12 RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, whether or not read_repair_chance is set. Because of this, in clusters with low writes and high reads, there is very heavy write traffic between nodes. We should have an option to turn this off, and it can be different from read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-8132: --- Due Date: 15/Jan/15 (was: 28/Nov/14) Save or stream hints to a safe place in node replacement Key: CASSANDRA-8132 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.1.3 Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new instance/machine. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement
Minh Do created CASSANDRA-8132: -- Summary: Save or stream hints to a safe place in node replacement Key: CASSANDRA-8132 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.1.1 Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node has a lot of stored hints, HintedHandOffManager seems very slow to play the hints. In our case, we tried to replace a node and had to wait for several days. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-8132: --- Description: Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new node. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. was: Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node has a lot of stored hints, HintedHandOffManager seems very slow to play the hints. In our case, we tried to replace a node and had to wait for several days. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. Save or stream hints to a safe place in node replacement Key: CASSANDRA-8132 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.1.1 Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new node. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-8132: --- Description: Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new instance/machine. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. was: Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new node. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. Save or stream hints to a safe place in node replacement Key: CASSANDRA-8132 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.1.1 Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new instance/machine. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8132) Save or stream hints to a safe place in node replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174729#comment-14174729 ] Minh Do commented on CASSANDRA-8132: Brandon, I mean it is the other way around: we stream hints from the node about to be replaced to one of its neighbors. It is just like in unbootstrap(), where we have to stream hints to the closest node prior to the shutdown. We need to do this because we don't want to lose hints when shutting down a node and replacing it with a new instance or machine. Save or stream hints to a safe place in node replacement Key: CASSANDRA-8132 URL: https://issues.apache.org/jira/browse/CASSANDRA-8132 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.1.1 Often, we need to replace a node with a new instance in the cloud environment where all nodes are still alive. To be safe without losing data, we usually make sure all hints are gone before we do this operation. Replacement means we just want to shut down the C* process on a node and bring up another instance to take over that node's token. However, if a node to be replaced has a lot of stored hints, its HintedHandOffManager seems very slow to send the hints to other nodes. In our case, we tried to replace a node and had to wait for several days before its stored hints were cleared out. As mentioned above, we need all hints on this node to clear out before we can terminate it and replace it with a new instance/machine. Since this is not a decommission, I am proposing that we have the same hints-streaming mechanism as in the decommission code. Furthermore, there needs to be a cmd for NodeTool to trigger this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169774#comment-14169774 ] Minh Do commented on CASSANDRA-8094: @Jonathan, can we introduce another option per Column Family, similar to read_repair_chance? Heavy writes in RangeSlice read requests -- Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.11 RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, whether or not read_repair_chance is set. Because of this, in clusters with low writes and high reads, there is very heavy write traffic between nodes. We should have an option to turn this off, and it can be different from read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
Minh Do created CASSANDRA-8094: -- Summary: Heavy writes in RangeSlice read requests Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.11 RangeSlice requests always do a scheduler read repair when coordinators try to resolve replicats' responses no matter read_repair_chance is set or not. Because of this, in low writes and high reads clusters, there are very high write requests going on between nodes. We should have an option to turn this off and this can be different than the read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8094) Heavy writes in RangeSlice read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-8094: --- Description: RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, whether or not read_repair_chance is set. Because of this, in clusters with low writes and high reads, there is very heavy write traffic between nodes. We should have an option to turn this off, and it can be different from read_repair_chance. was: RangeSlice requests always do a scheduler read repair when coordinators try to resolve replicats' responses no matter read_repair_chance is set or not. Because of this, in low writes and high reads clusters, there are very high write requests going on between nodes. We should have an option to turn this off and this can be different than the read_repair_chance. Heavy writes in RangeSlice read requests -- Key: CASSANDRA-8094 URL: https://issues.apache.org/jira/browse/CASSANDRA-8094 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Minh Do Assignee: Minh Do Fix For: 2.0.11 RangeSlice requests always do a scheduled read repair when coordinators try to resolve replicas' responses, whether or not read_repair_chance is set. Because of this, in clusters with low writes and high reads, there is very heavy write traffic between nodes. We should have an option to turn this off, and it can be different from read_repair_chance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-7818) Improve compaction logging
[ https://issues.apache.org/jira/browse/CASSANDRA-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do reassigned CASSANDRA-7818: -- Assignee: Minh Do Improve compaction logging -- Key: CASSANDRA-7818 URL: https://issues.apache.org/jira/browse/CASSANDRA-7818 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Assignee: Minh Do Priority: Minor Labels: compaction, lhf Fix For: 2.1.1 We should log more information about compactions to be able to debug issues more efficiently * give each CompactionTask an id that we log (so that you can relate the start-compaction-messages to the finished-compaction ones) * log what level the sstables are taken from -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6702) Upgrading node uses the wrong port in gossiping
[ https://issues.apache.org/jira/browse/CASSANDRA-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038266#comment-14038266 ] Minh Do commented on CASSANDRA-6702: If I recall correctly, this happened on C* 1.2 nodes while the cluster was still in a mixed mode and the target nodes were seed nodes (C* 1.1.x). After a while, gossip seemed to settle down correctly on the right IPs and ports. However, this took significant time depending on the size of the cluster. Upgrading node uses the wrong port in gossiping --- Key: CASSANDRA-6702 URL: https://issues.apache.org/jira/browse/CASSANDRA-6702 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.1.7, AWS, Ec2MultiRegionSnitch Reporter: Minh Do Priority: Minor Fix For: 1.2.17 When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip information on port/IP, I could see that the upgrading node (1.2 version) communicates with one other node in the same region using the public IP and non-encrypted port. For the rest, the upgrading node uses the correct ports and IPs to communicate in this manner: Same region: private IP and non-encrypted port, and Different region: public IP and encrypted port. Because there is one node like this (or 2 out of a 12-node cluster in which nodes are split equally across 2 AWS regions), we have to modify the Security Group to allow the new traffic. Without modifying the SG, the 95th and 99th latencies for both reads and writes in the cluster are very bad (due to RPC timeout). Inspecting closer, that upgraded node (1.2 node) is contributing to all of the high latencies whenever it acts as a coordinator node. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6702) Upgrading node uses the wrong port in gossiping
[ https://issues.apache.org/jira/browse/CASSANDRA-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6702: --- Description: When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip information on port/IP, I could see that the upgrading node (1.2 version) communicates with one other node in the same region using the public IP and non-encrypted port. For the rest, the upgrading node uses the correct ports and IPs to communicate in this manner: Same region: private IP and non-encrypted port, and Different region: public IP and encrypted port. Because there is one node like this (or 2 out of a 12-node cluster in which nodes are split equally across 2 AWS regions), we have to modify the Security Group to allow the new traffic. Without modifying the SG, the 95th and 99th latencies for both reads and writes in the cluster are very bad (due to RPC timeout). Inspecting closer, that upgraded node (1.2 node) is contributing to all of the high latencies whenever it acts as a coordinator node. was: When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip information on port/IP, I could see that the upgrading node (1.2 version) communicates with one other node in the same region using the public IP and non-encrypted port. For the rest, the upgrading node uses the correct ports and IPs to communicate in this manner: Same region: private IP and non-encrypted port, and Different region: public IP and encrypted port. Because there is one node like this (or probably 2 max), we have to modify the Security Group to allow the new traffic. Without modifying the SG, the 95th and 99th latencies for both reads and writes in the cluster are very bad (due to RPC timeout). Inspecting closer, that upgraded node (1.2 node) is contributing to all of the high latencies whenever it acts as a coordinator node. Upgrading node uses the wrong port in gossiping --- Key: CASSANDRA-6702 URL: https://issues.apache.org/jira/browse/CASSANDRA-6702 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.1.7, AWS, Ec2MultiRegionSnitch Reporter: Minh Do Priority: Minor Fix For: 1.2.16 When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip information on port/IP, I could see that the upgrading node (1.2 version) communicates with one other node in the same region using the public IP and non-encrypted port. For the rest, the upgrading node uses the correct ports and IPs to communicate in this manner: Same region: private IP and non-encrypted port, and Different region: public IP and encrypted port. Because there is one node like this (or 2 out of a 12-node cluster in which nodes are split equally across 2 AWS regions), we have to modify the Security Group to allow the new traffic. Without modifying the SG, the 95th and 99th latencies for both reads and writes in the cluster are very bad (due to RPC timeout). Inspecting closer, that upgraded node (1.2 node) is contributing to all of the high latencies whenever it acts as a coordinator node. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (CASSANDRA-6702) Upgrading node uses the wrong port in gossiping
Minh Do created CASSANDRA-6702: -- Summary: Upgrading node uses the wrong port in gossiping Key: CASSANDRA-6702 URL: https://issues.apache.org/jira/browse/CASSANDRA-6702 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.1.7, AWS, Ec2MultiRegionSnitch Reporter: Minh Do Priority: Minor Fix For: 1.2.15 When upgrading a node in a 1.1.7 (or 1.1.11) cluster to 1.2.15 and inspecting the gossip information on port/IP, I could see that the upgrading node (1.2 version) communicates with one other node in the same region using the public IP and non-encrypted port. For the rest, the upgrading node uses the correct ports and IPs to communicate in this manner: Same region: private IP and non-encrypted port, and Different region: public IP and encrypted port. Because there is one node like this (or probably 2 max), we have to modify the Security Group to allow the new traffic. Without modifying the SG, the 95th and 99th latencies for both reads and writes in the cluster are very bad (due to RPC timeout). Inspecting closer, that upgraded node (1.2 node) is contributing to all of the high latencies whenever it acts as a coordinator node. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890449#comment-13890449 ] Minh Do commented on CASSANDRA-5263: If I understand correctly, are you saying that if N is the total number of rows in all SSTables on a node for a given token range, then depth = log2(N)? This works if a node does not hold too many rows. Can we safely assume that a node does not hold more than 2^24 rows (or 16.7M rows)? For this many rows, we need to build a Merkle tree with depth 24, which requires about 1.6G of heap. Beyond this number, I would say we run into heap allocation issues. I was thinking earlier that depth 20 is the maximum allowable depth, and I worked my way down from it to compute lower-depth trees. Allow Merkle tree maximum depth to be configurable -- Key: CASSANDRA-5263 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263 Project: Cassandra Issue Type: Improvement Components: Config Affects Versions: 1.1.9 Reporter: Ahmed Bashir Assignee: Minh Do Currently, the maximum depth allowed for Merkle trees is hardcoded as 15. This value should be configurable, just like phi_convict_treshold and other properties. Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons can result in a large amount of unnecessary row keys being streamed. Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't affect the Merkle tree generation and differencing timings all that much, and they can significantly reduce the amount of data being streamed during repair. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889277#comment-13889277 ] Minh Do commented on CASSANDRA-5263: Using some generated sstable files within a token range, I ran a test building the Merkle tree at depth 20 and then adding the computed hash values for rows (69M added rows). These 2 steps together are equivalent to a validation compaction process on a token range, if I am not missing anything. 1. Tree building uses, on average, 15-18% of total CPU resources, and no I/O. 2. SSTable scanning and row hash computation use, on average, 10-12% of total CPU resources, and I/O resources limited by the configurable global compaction rate limiter. Given Jonathan's pointer on using SSTR.estimatedKeysForRanges() to calculate the number of rows for an SSTable file, and assuming no overlapping among SSTable files (worst case), we can estimate how many data rows are in a given token range. From what I understand, here is the formula to calculate the Merkle tree's depth (assuming each data row has a unique hash value): 1. If the number of rows from all SSTables in a given range is approximately equal to the maximum number of hash entries in that range (subject to a CF's partitioner), then we build the tree at depth 20 (the densest case). 2. When the number of rows from all SSTables in a given range does not cover the full hash range, or in the sparse case, we build a Merkle tree with a depth less than 20. How do we come up with the right depth? depth = 20 * (n rows / max rows), where n is the total number of rows in all SSTables and max is the maximum number of hash entries in that token range. However, since different partitioners give different max numbers, is there anything we can assume to make it easy here, like assuming all partitioners would have the same hash entries in a given token range? Allow Merkle tree maximum depth to be configurable -- Key: CASSANDRA-5263 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263 Project: Cassandra Issue Type: Improvement Components: Config Affects Versions: 1.1.9 Reporter: Ahmed Bashir Assignee: Minh Do Currently, the maximum depth allowed for Merkle trees is hardcoded as 15. This value should be configurable, just like phi_convict_treshold and other properties. Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons can result in a large amount of unnecessary row keys being streamed. Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't affect the Merkle tree generation and differencing timings all that much, and they can significantly reduce the amount of data being streamed during repair. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
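A small sketch of the depth heuristic described above; the class and method names are illustrative, not from the Cassandra code base, and estimatedRows stands in for the sum of SSTR.estimatedKeysForRanges() over the sstables in the range:

    // Scale the depth down from the 20-level maximum when the estimated row
    // count covers only part of the hash space for the token range.
    final class MerkleDepthEstimator
    {
        static final int MAX_DEPTH = 20;

        static int estimateDepth(long estimatedRows, double maxHashEntries)
        {
            if (estimatedRows <= 0 || maxHashEntries <= 0)
                return 1;
            // depth = 20 * (n rows / max rows), clamped to [1, 20]
            int depth = (int) Math.ceil(MAX_DEPTH * (estimatedRows / maxHashEntries));
            return Math.max(1, Math.min(MAX_DEPTH, depth));
        }
    }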
[jira] [Commented] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884312#comment-13884312 ] Minh Do commented on CASSANDRA-6619: Jonathan, you are right that both 1.2 and 1.1 are designed to read out the versions from each other's headers. However, 1.2, as the sender opening the outbound socket, expects to receive the version int back immediately after it sends out its own. 1.1, as the receiver, can read the 1.2 header but does not send the version int back. Here is the piece of code in 1.2 in IncomingTcpConnection.java that sends back the version int: private void handleModernVersion(int version, int header) throws IOException { DataOutputStream out = new DataOutputStream(socket.getOutputStream()); out.writeInt(MessagingService.current_version); out.flush(); ... } Because 1.1 does not send this back immediately, OutboundTcpConnection will time out on the read and the socket gets disconnected. The whole cycle repeats again and again until some code sets the target version right. In the lucky case, IncomingTcpConnection sets the right target version. However, it takes a while for the other 1.1 nodes to learn that there is a new 1.2 node, especially if the new 1.2 node can't connect to any 1.1 nodes first. Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
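To make the failure mode concrete, here is a sketch of the outbound-side expectation described in the comment; it is not the actual OutboundTcpConnection code, and the method name and timeout handling are assumptions:

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    final class HandshakeSketch
    {
        // Write our version, then block (with a timeout) for the peer's version int.
        // A 1.2 peer echoes its version back immediately; a 1.1 peer never does,
        // so the read times out, the socket is closed, and the cycle repeats.
        static Integer exchangeVersions(Socket socket, int ourVersion, int readTimeoutMillis)
                throws IOException
        {
            socket.setSoTimeout(readTimeoutMillis);
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            out.writeInt(ourVersion);
            out.flush();
            try
            {
                return new DataInputStream(socket.getInputStream()).readInt();
            }
            catch (SocketTimeoutException e)
            {
                socket.close(); // handshake failed; it will be retried later
                return null;
            }
        }
    }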
[jira] [Comment Edited] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884312#comment-13884312 ] Minh Do edited comment on CASSANDRA-6619 at 1/28/14 4:41 PM: - Jonathan, you are right that both 1.2 and 1.1 are designed to read out the versions from each other's headers. However, 1.2, as the sender opening the outbound socket, expects to receive the version int back immediately after it sends out its own. 1.1, as the receiver, can read the 1.2 header but does not send the version int back. Here is the piece of code in 1.2 in IncomingTcpConnection.java that sends back the version int: private void handleModernVersion(int version, int header) throws IOException { DataOutputStream out = new DataOutputStream(socket.getOutputStream()); out.writeInt(MessagingService.current_version); out.flush(); ... } Because 1.1 does not send this back immediately, 1.2 OutboundTcpConnection will time out on the read and the socket gets disconnected. The whole cycle repeats again and again until some code sets the target version right. In the lucky case, IncomingTcpConnection sets the right target version. However, it takes a while for the other 1.1 nodes to learn that there is a new 1.2 node, especially if the new 1.2 node can't connect to any 1.1 nodes first. was (Author: timiblossom): Jonathan, you are right that both 1.2 and 1.1 are designed to read out the versions from each other's headers. However, 1.2, as the sender opening the outbound socket, expects to receive the version int back immediately after it sends out its own. 1.1, as the receiver, can read the 1.2 header but does not send the version int back. Here is the piece of code in 1.2 in IncomingTcpConnection.java that sends back the version int: private void handleModernVersion(int version, int header) throws IOException { DataOutputStream out = new DataOutputStream(socket.getOutputStream()); out.writeInt(MessagingService.current_version); out.flush(); ... } Because 1.1 does not send this back immediately, OutboundTcpConnection will time out on the read and the socket gets disconnected. The whole cycle repeats again and again until some code sets the target version right. In the lucky case, IncomingTcpConnection sets the right target version. However, it takes a while for the other 1.1 nodes to learn that there is a new 1.2 node, especially if the new 1.2 node can't connect to any 1.1 nodes first. Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it does not fully fix the issue. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Description: There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it does not fully fix the issue. We already have a patch for this and will attach shortly for feedback. was: There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it does not fully fix the issue. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884312#comment-13884312 ] Minh Do edited comment on CASSANDRA-6619 at 1/28/14 4:50 PM: - Jonathan, you are right that both 1.2 and 1.1 are designed to read out the versions from each other's headers. However, 1.2, as the sender opening the outbound socket, expects to receive the version int back immediately after it sends out its own. 1.1, as the receiver, can read the 1.2 header but does not send the version int back. Here is the piece of code in 1.2 in IncomingTcpConnection.java that sends back the version int: private void handleModernVersion(int version, int header) throws IOException { DataOutputStream out = new DataOutputStream(socket.getOutputStream()); out.writeInt(MessagingService.current_version); out.flush(); ... } Because 1.1 does not send this back immediately, 1.2 OutboundTcpConnection will time out on the read and the socket gets disconnected. The whole cycle repeats again and again until some code sets the target version right. In the lucky case, IncomingTcpConnection sets the right target version. However, it takes a while for the other 1.1 nodes to learn that there is a new 1.2 node, especially if the new 1.2 node can't connect to any 1.1 nodes first. The version convergence will eventually settle down. However, in a large cluster, this would take some time, causing side effects during that period such as high read latencies and more hints being stored. was (Author: timiblossom): Jonathan, you are right that both 1.2 and 1.1 are designed to read out the versions from each other's headers. However, 1.2, as the sender opening the outbound socket, expects to receive the version int back immediately after it sends out its own. 1.1, as the receiver, can read the 1.2 header but does not send the version int back. Here is the piece of code in 1.2 in IncomingTcpConnection.java that sends back the version int: private void handleModernVersion(int version, int header) throws IOException { DataOutputStream out = new DataOutputStream(socket.getOutputStream()); out.writeInt(MessagingService.current_version); out.flush(); ... } Because 1.1 does not send this back immediately, 1.2 OutboundTcpConnection will time out on the read and the socket gets disconnected. The whole cycle repeats again and again until some code sets the target version right. In the lucky case, IncomingTcpConnection sets the right target version. However, it takes a while for the other 1.1 nodes to learn that there is a new 1.2 node, especially if the new 1.2 node can't connect to any 1.1 nodes first. Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it does not fully fix the issue. 
We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883419#comment-13883419 ] Minh Do commented on CASSANDRA-6619: As posted in other tickets, 1.1 and 1.2 have different message protocols. Hence, it is important to set the right target version when making outbound connections rather than depending on the inbound connections to set a version value; this resolves the race condition in setting the version values. Attached is a patch to make sure the code does that when an outbound connection is opened and the exchange of versioning information in the handshake fails. As discussed with Jason Brown here at Netflix, we came up with a solution: during the upgrade, the upgraded nodes have the variable cassandra.prev_version = 5 (for 1.1.7, or 4 for 1.1) in their environment to help out the handshakes in a mixed-version cluster. Once a cluster is fully upgraded to 1.2, cassandra.prev_version is removed from all nodes' environment and a C* rolling restart across nodes is required. This step ensures that the new patch won't penalize the 1.2 cluster where all outbound connections are from 1.2 to 1.2. Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
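The attached patch is not reproduced here, so how it consumes cassandra.prev_version is an assumption; the sketch below only illustrates reading it as a JVM system property (e.g. started with -Dcassandra.prev_version=5 during a 1.1.7 to 1.2 upgrade) and falling back to it when the version handshake fails:

    final class PrevVersionFallback
    {
        // null once the property is removed after the whole cluster runs 1.2
        static final Integer PREV_VERSION = Integer.getInteger("cassandra.prev_version");

        static int targetVersionAfterFailedHandshake(int currentVersion)
        {
            // During the mixed-version phase, assume the peer still speaks the
            // previous protocol version; afterwards, just use our own version.
            return PREV_VERSION != null ? PREV_VERSION : currentVersion;
        }
    }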
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Attachment: (was: diff) Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Attachment: diff Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Attachment: patch.txt Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Reviewer: Jason Brown Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 Attachments: patch.txt There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (CASSANDRA-6619) Race condition during upgrading 1.1 to 1.2
Minh Do created CASSANDRA-6619: -- Summary: Race condition during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 There was a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster because the upgrading process takes 10+ hours to 1+ days to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly to let the community to review. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (CASSANDRA-6619) Race condition issue during upgrading 1.1 to 1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6619: --- Description: There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. was: There was a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster because the upgrading process takes 10+ hours to 1+ days to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly to let the community to review. Summary: Race condition issue during upgrading 1.1 to 1.2 (was: Race condition during upgrading 1.1 to 1.2) Race condition issue during upgrading 1.1 to 1.2 Key: CASSANDRA-6619 URL: https://issues.apache.org/jira/browse/CASSANDRA-6619 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Assignee: Minh Do Priority: Minor Fix For: 1.2.14 There is a race condition during upgrading a C* 1.1x cluster to C* 1.2. One issue is that OutboundTCPConnection can't establish from a 1.2 node to some 1.1x nodes. Because of this, a live cluster during the upgrading will suffer in high read latency and be unable to fulfill some write requests. It won't be a problem if there is a small cluster but it is a problem in a large cluster (100+ nodes) because the upgrading process takes 10+ hours to 1+ day(s) to complete. Acknowledging about CASSANDRA-5692, however, it is not fully fixed. We already have a patch for this and will attach shortly for feedback. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869384#comment-13869384 ] Minh Do commented on CASSANDRA-5263: I also don't see how we can use sstable stats to adjust the MerkleTree depth automatically. We can estimate the number of rows for each sstable, but we don't know how many rows are in a given range (unless we assume the input is always a full range). In terms of memory usage, a MerkleTree with depth 20 uses around 100Mb and a MerkleTree with depth 17 uses around 15Mb. Does the extra 100Mb hurt Cassandra performance on some nodes in some cases if we go to this extreme? Also, if we use depth 20 and a multithreaded version to build the MerkleTree, it is going to impact the response latency. Some thoughts? Allow Merkle tree maximum depth to be configurable -- Key: CASSANDRA-5263 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263 Project: Cassandra Issue Type: Improvement Components: Config Affects Versions: 1.1.9 Reporter: Ahmed Bashir Assignee: Minh Do Currently, the maximum depth allowed for Merkle trees is hardcoded as 15. This value should be configurable, just like phi_convict_treshold and other properties. Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons can result in a large amount of unnecessary row keys being streamed. Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't affect the Merkle tree generation and differencing timings all that much, and they can significantly reduce the amount of data being streamed during repair. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do reassigned CASSANDRA-5263: -- Assignee: Minh Do Allow Merkle tree maximum depth to be configurable -- Key: CASSANDRA-5263 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263 Project: Cassandra Issue Type: Improvement Components: Config Affects Versions: 1.1.9 Reporter: Ahmed Bashir Assignee: Minh Do Currently, the maximum depth allowed for Merkle trees is hardcoded as 15. This value should be configurable, just like phi_convict_treshold and other properties. Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons can result in a large amount of unnecessary row keys being streamed. Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't affect the Merkle tree generation and differencing timings all that much, and they can significantly reduce the amount of data being streamed during repair. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (CASSANDRA-6323) Create new sstables in the highest possible level
[ https://issues.apache.org/jira/browse/CASSANDRA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do reassigned CASSANDRA-6323: -- Assignee: Minh Do Create new sstables in the highest possible level - Key: CASSANDRA-6323 URL: https://issues.apache.org/jira/browse/CASSANDRA-6323 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jonathan Ellis Assignee: Minh Do Priority: Minor Labels: compaction Fix For: 2.0.3 See PickLevelForMemTableOutput here: https://code.google.com/p/leveldb/source/browse/db/version_set.cc#507 (Moving from CASSANDRA-5936) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (CASSANDRA-6308) Thread leak caused in creating OutboundTcpConnectionPool
Minh Do created CASSANDRA-6308: -- Summary: Thread leak caused in creating OutboundTcpConnectionPool Key: CASSANDRA-6308 URL: https://issues.apache.org/jira/browse/CASSANDRA-6308 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Priority: Minor Fix For: 1.2.12, 2.0.3 We have seen in one of our large clusters that there are many OutboundTcpConnection threads having the same names. From a thread dump, OutboundTcpConnection threads have accounted for the largest shares of the total threads (65%+) and kept growing. Here is a portion of a grep output for threads in which names start with WRITE-: WRITE-/10.28.131.195 daemon prio=10 tid=0x2aaac4022000 nid=0x2cb5 waiting on condition [0x2acfbacda000] WRITE-/10.28.131.195 daemon prio=10 tid=0x2aaac42fe000 nid=0x2cb4 waiting on condition [0x2acfbacad000] WRITE-/10.30.142.49 daemon prio=10 tid=0x4084 nid=0x2cb1 waiting on condition [0x2acfbac8] WRITE-/10.6.222.233 daemon prio=10 tid=0x4083e000 nid=0x2cb0 waiting on condition [0x2acfbac53000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4083b800 nid=0x2caf waiting on condition [0x2acfbac26000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40839800 nid=0x2cae waiting on condition [0x2acfbabf9000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40837800 nid=0x2cad waiting on condition [0x2acfbabcc000] WRITE-/10.6.222.233 daemon prio=10 tid=0x404a3800 nid=0x2cac waiting on condition [0x2acfbab9f000] WRITE-/10.30.142.49 daemon prio=10 tid=0x404a1800 nid=0x2cab waiting on condition [0x2acfbab72000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049f800 nid=0x2caa waiting on condition [0x2acfbab45000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049e000 nid=0x2ca9 waiting on condition [0x2acfbab18000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049c800 nid=0x2ca8 waiting on condition [0x2acfbaaeb000] WRITE-/10.157.10.134 daemon prio=10 tid=0x4049a800 nid=0x2ca7 waiting on condition [0x2acfbaabe000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40498800 nid=0x2ca6 waiting on condition [0x2acfbaa91000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40496800 nid=0x2ca5 waiting on condition [0x2acfbaa64000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40717800 nid=0x2ca4 waiting on condition [0x2acfbaa37000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40716000 nid=0x2ca3 waiting on condition [0x2acfbaa0a000] WRITE-/10.30.146.195 daemon prio=10 tid=0x40714800 nid=0x2ca2 waiting on condition [0x2acfba9dd000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40712800 nid=0x2ca1 waiting on condition [0x2acfba9b] WRITE-/10.6.222.233 daemon prio=10 tid=0x40710800 nid=0x2ca0 waiting on condition [0x2acfba983000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070e800 nid=0x2c9f waiting on condition [0x2acfba956000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070d000 nid=0x2c9e waiting on condition [0x2acfba929000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070b800 nid=0x2c9d waiting on condition [0x2acfba8fc000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070a000 nid=0x2c9c waiting on condition [0x2acfba8cf000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40827000 nid=0x2c9b waiting on condition [0x2acfba8a2000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40825000 nid=0x2c9a waiting on condition [0x2acfba875000] WRITE-/10.6.222.233 daemon prio=10 tid=0x2aaac488e000 nid=0x2c99 waiting on condition [0x2acfba848000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40823000 nid=0x2c98 waiting on condition [0x2acfba81b000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40821800 nid=0x2c97 waiting on condition [0x2acfba7ee000] WRITE-/10.30.146.195 daemon prio=10 tid=0x4081f000 nid=0x2c96 
waiting on condition [0x2acfba7c1000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4081d000 nid=0x2c95 waiting on condition [0x2acfba794000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4081b000 nid=0x2c94 waiting on condition [0x2acfba767000] WRITE-/10.6.222.233 daemon prio=10 tid=0x2aaac488b000 nid=0x2c93 waiting on condition [0x2acfba73a000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40819000 nid=0x2c92 waiting on condition [0x2acfba70d000] WRITE-/10.6.222.233 daemon prio=10 tid=0x407f9000 nid=0x2c91 waiting on condition [0x2acfba6e] WRITE-/10.6.222.233 daemon prio=10 tid=0x407f7000 nid=0x2c90 waiting on condition [0x2acfba6b3000] WRITE-/10.6.222.233 daemon prio=10 tid=0x407f5000 nid=0x2c8f waiting on condition [0x2acfba686000] WRITE-/10.6.222.233 daemon prio=10 tid=0x407f3000 nid=0x2c8d
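One plausible shape of the leak described in this ticket is a race on pool creation: if the pool's constructor starts its WRITE- connection threads and two callers race on putIfAbsent, the losing pool is silently dropped without its threads ever being stopped. The sketch below only illustrates that pattern with simplified, assumed names; it is not the actual OutboundTcpConnectionPool code and the real root cause may differ.
{noformat}
import java.net.InetAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified illustration of a pool-creation race that leaks writer threads;
// class and method names are assumed for the sketch, not the real Cassandra code.
public class PoolCacheSketch
{
    private final ConcurrentMap<InetAddress, Pool> pools = new ConcurrentHashMap<>();

    Pool getConnectionPool(InetAddress to)
    {
        Pool pool = pools.get(to);
        if (pool != null)
            return pool;

        Pool candidate = new Pool(to);                 // constructor starts WRITE-/<ip> threads
        Pool existing = pools.putIfAbsent(to, candidate);
        if (existing != null)
        {
            candidate.shutdown();                      // without this, the loser's threads leak
            return existing;
        }
        return candidate;
    }

    // Minimal stand-in for OutboundTcpConnectionPool.
    static class Pool
    {
        Pool(InetAddress to) { /* start connection writer threads here */ }
        void shutdown()      { /* signal writer threads to exit */ }
    }
}
{noformat}
Either preventing the race or explicitly stopping the losing pool's threads would keep duplicate WRITE-/<ip> threads from accumulating the way the dump above shows.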
[jira] [Updated] (CASSANDRA-6308) Thread leak caused in creating OutboundTcpConnectionPool
[ https://issues.apache.org/jira/browse/CASSANDRA-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minh Do updated CASSANDRA-6308: --- Attachment: patch.txt Thread leak caused in creating OutboundTcpConnectionPool Key: CASSANDRA-6308 URL: https://issues.apache.org/jira/browse/CASSANDRA-6308 Project: Cassandra Issue Type: Bug Components: Core Reporter: Minh Do Priority: Minor Labels: leak, thread Fix For: 1.2.12 Attachments: patch.txt We have seen in one of our large clusters that there are many OutboundTcpConnection threads having the same names. From a thread dump, OutboundTcpConnection threads have accounted for the largest shares of the total threads (65%+) and kept growing. Here is a portion of a grep output for threads in which names start with WRITE-: WRITE-/10.28.131.195 daemon prio=10 tid=0x2aaac4022000 nid=0x2cb5 waiting on condition [0x2acfbacda000] WRITE-/10.28.131.195 daemon prio=10 tid=0x2aaac42fe000 nid=0x2cb4 waiting on condition [0x2acfbacad000] WRITE-/10.30.142.49 daemon prio=10 tid=0x4084 nid=0x2cb1 waiting on condition [0x2acfbac8] WRITE-/10.6.222.233 daemon prio=10 tid=0x4083e000 nid=0x2cb0 waiting on condition [0x2acfbac53000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4083b800 nid=0x2caf waiting on condition [0x2acfbac26000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40839800 nid=0x2cae waiting on condition [0x2acfbabf9000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40837800 nid=0x2cad waiting on condition [0x2acfbabcc000] WRITE-/10.6.222.233 daemon prio=10 tid=0x404a3800 nid=0x2cac waiting on condition [0x2acfbab9f000] WRITE-/10.30.142.49 daemon prio=10 tid=0x404a1800 nid=0x2cab waiting on condition [0x2acfbab72000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049f800 nid=0x2caa waiting on condition [0x2acfbab45000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049e000 nid=0x2ca9 waiting on condition [0x2acfbab18000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4049c800 nid=0x2ca8 waiting on condition [0x2acfbaaeb000] WRITE-/10.157.10.134 daemon prio=10 tid=0x4049a800 nid=0x2ca7 waiting on condition [0x2acfbaabe000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40498800 nid=0x2ca6 waiting on condition [0x2acfbaa91000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40496800 nid=0x2ca5 waiting on condition [0x2acfbaa64000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40717800 nid=0x2ca4 waiting on condition [0x2acfbaa37000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40716000 nid=0x2ca3 waiting on condition [0x2acfbaa0a000] WRITE-/10.30.146.195 daemon prio=10 tid=0x40714800 nid=0x2ca2 waiting on condition [0x2acfba9dd000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40712800 nid=0x2ca1 waiting on condition [0x2acfba9b] WRITE-/10.6.222.233 daemon prio=10 tid=0x40710800 nid=0x2ca0 waiting on condition [0x2acfba983000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070e800 nid=0x2c9f waiting on condition [0x2acfba956000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070d000 nid=0x2c9e waiting on condition [0x2acfba929000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070b800 nid=0x2c9d waiting on condition [0x2acfba8fc000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4070a000 nid=0x2c9c waiting on condition [0x2acfba8cf000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40827000 nid=0x2c9b waiting on condition [0x2acfba8a2000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40825000 nid=0x2c9a waiting on condition [0x2acfba875000] WRITE-/10.6.222.233 daemon prio=10 tid=0x2aaac488e000 nid=0x2c99 waiting on condition [0x2acfba848000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40823000 nid=0x2c98 waiting on condition 
[0x2acfba81b000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40821800 nid=0x2c97 waiting on condition [0x2acfba7ee000] WRITE-/10.30.146.195 daemon prio=10 tid=0x4081f000 nid=0x2c96 waiting on condition [0x2acfba7c1000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4081d000 nid=0x2c95 waiting on condition [0x2acfba794000] WRITE-/10.6.222.233 daemon prio=10 tid=0x4081b000 nid=0x2c94 waiting on condition [0x2acfba767000] WRITE-/10.6.222.233 daemon prio=10 tid=0x2aaac488b000 nid=0x2c93 waiting on condition [0x2acfba73a000] WRITE-/10.6.222.233 daemon prio=10 tid=0x40819000 nid=0x2c92 waiting on condition [0x2acfba70d000] WRITE-/10.6.222.233 daemon prio=10 tid=0x407f9000 nid=0x2c91 waiting
[jira] [Commented] (CASSANDRA-5175) Unbounded (?) thread growth connecting to an removed node
[ https://issues.apache.org/jira/browse/CASSANDRA-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721091#comment-13721091 ] Minh Do commented on CASSANDRA-5175: Hi Vijay, I am using your commit db8705294ba96fe2b746fea4f26a919538653ebd but I think the logic in this commit is not the same as the attached patch. Please take a look. if (m == CLOSE_SENTINEL) { disconnect(); +if (!isStopped) +break; continue; } I think it should be : if (isStopped) break; Thanks. Unbounded (?) thread growth connecting to an removed node - Key: CASSANDRA-5175 URL: https://issues.apache.org/jira/browse/CASSANDRA-5175 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.8 Environment: EC2, JDK 7u9, Ubuntu 12.04.1 LTS Reporter: Janne Jalkanen Assignee: Vijay Priority: Minor Fix For: 1.1.10, 1.2.1 Attachments: 0001-CASSANDRA-5175.patch The following lines started repeating every minute in the log file {noformat} INFO [GossipStage:1] 2013-01-19 19:35:43,929 Gossiper.java (line 831) InetAddress /10.238.x.y is now dead. INFO [GossipStage:1] 2013-01-19 19:35:43,930 StorageService.java (line 1291) Removing token 170141183460469231731687303715884105718 for /10.238.x.y {noformat} Also, I got about 3000 threads which all look like this: {noformat} Name: WRITE-/10.238.x.y State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1bb65c0f Total blocked: 0 Total waited: 3 Stack trace: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:104) {noformat} A new thread seems to be created every minute, and they never go away. The endpoint in question had been a part of the cluster weeks ago, and the node exhibiting the thread growth was added yesterday. Anyway, assassinating the endpoint in question stopped thread growth (but kept the existing threads running), so this isn't a huge issue. But I don't think the thread count is supposed to be increasing like this... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
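To make the correction suggested in the comment above easier to read, here is the control flow it describes as a self-contained sketch with simplified names (not the actual OutboundTcpConnection source): after handling a CLOSE_SENTINEL the writer thread should exit only when the connection has been stopped for good; with the inverted check (!isStopped), threads for stopped connections never exit, which matches the unbounded WRITE- thread growth reported in this ticket.
{noformat}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch of the writer-loop fix being discussed; names are
// simplified stand-ins, not the exact Cassandra source.
class WriterLoopSketch implements Runnable
{
    static final Object CLOSE_SENTINEL = new Object();

    private final BlockingQueue<Object> backlog = new LinkedBlockingQueue<>();
    private volatile boolean isStopped;

    void closeSocket(boolean permanent)
    {
        isStopped = permanent;
        backlog.offer(CLOSE_SENTINEL);
    }

    public void run()
    {
        while (true)
        {
            Object m;
            try { m = backlog.take(); }
            catch (InterruptedException e) { break; }

            if (m == CLOSE_SENTINEL)
            {
                disconnect();
                if (isStopped)   // suggested check: exit only on a permanent stop
                    break;
                continue;        // transient close: keep the thread for reconnects
            }
            // ... otherwise serialize m to the socket ...
        }
    }

    private void disconnect() { /* close the underlying socket */ }
}
{noformat}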