[jira] [Updated] (CASSANDRA-9059) read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL
[ https://issues.apache.org/jira/browse/CASSANDRA-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Zhu updated CASSANDRA-9059:
-------------------------------
    Summary: read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL  (was: read_repair slows down the response for multi-dc setup)

> read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL
>
>                 Key: CASSANDRA-9059
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9059
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: CentOS 6.6, Cassandra 2.0.8
>            Reporter: Wei Zhu
>             Fix For: 2.0.14
>
>         Attachments: coodidatorreadlatency_after.png, coodidatorreadlatency_before.png
>
> We have a two-DC setup for the Cassandra cluster, 7 nodes each, RF = 3. We set the CL to LOCAL_QUORUM for both reads and writes.
> We noticed that the 95th percentile coordinatorReadLatency was in the 60ms range. After we changed read_repair_chance for this CF from the default 0.1 to 0.0, we saw a dramatic improvement in coordinatorReadLatency: it went down to 10ms.
> I suspect read_repair somehow waits for, or blocks on, the response from the other DC, which shouldn't happen.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-9059) read_repair slows down the response for multi-dc setup
Wei Zhu created CASSANDRA-9059:
-----------------------------------
             Summary: read_repair slows down the response for multi-dc setup
                 Key: CASSANDRA-9059
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9059
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: CentOS 6.6, Cassandra 2.0.8
            Reporter: Wei Zhu
         Attachments: coodidatorreadlatency_after.png, coodidatorreadlatency_before.png

We have a two-DC setup for the Cassandra cluster, 7 nodes each, RF = 3. We set the CL to LOCAL_QUORUM for both reads and writes.

We noticed that the 95th percentile coordinatorReadLatency was in the 60ms range. After we changed read_repair_chance for this CF from the default 0.1 to 0.0, we saw a dramatic improvement in coordinatorReadLatency: it went down to 10ms.

I suspect read_repair somehow waits for, or blocks on, the response from the other DC, which shouldn't happen.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
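The percentile jump reported above can be sketched numerically: if roughly 10% of reads (the default read_repair_chance of 0.1) block on a cross-DC response, the 95th percentile lands entirely on the slow path. The following is a minimal simulation, not a measurement; the 10ms/60ms figures are taken from the report and the blocking behavior is the reporter's hypothesis:

```python
import random

def p95(samples):
    """Return the 95th-percentile value of a list of latencies."""
    s = sorted(samples)
    return s[int(0.95 * (len(s) - 1))]

def simulate(read_repair_chance, local_ms=10.0, cross_dc_ms=60.0,
             n=100_000, seed=42):
    """Simulate coordinator read latencies under the hypothesis that,
    with probability read_repair_chance, a read blocks on a cross-DC reply."""
    rng = random.Random(seed)
    return [cross_dc_ms if rng.random() < read_repair_chance else local_ms
            for _ in range(n)]

# ~10% of reads on the slow path is more than enough to drag the 95th
# percentile all the way up to the cross-DC cost.
print(p95(simulate(0.1)))  # 60.0
print(p95(simulate(0.0)))  # 10.0
```

This illustrates why a small read_repair_chance can dominate tail latency: any slow-path probability above 5% pins the p95 at the slow-path cost.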
[jira] [Commented] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326218#comment-14326218 ]

Wei Zhu commented on CASSANDRA-8819:
------------------------------------
For us, the reason we set the consistency level to LOCAL_QUORUM is that we don't require (immediate) consistency in DC2, which in our case is a backup DC. We are under the assumption that we can do whatever we want to DC2 as long as the consistency level is LOCAL_QUORUM. I totally agree that non-local endpoints should be excluded for LOCAL_QUORUM, even for bootstrapping nodes.

> LOCAL_QUORUM writes return wrong message
> ----------------------------------------
>
>                 Key: CASSANDRA-8819
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: CentOS 6.6
>            Reporter: Wei Zhu
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.13
>
> We have two DCs, each with 7 nodes.
> Here is the keyspace setup:
>
> create keyspace test
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {DC2 : 3, DC1 : 3}
>   and durable_writes = true;
>
> We brought down two nodes in DC2 for maintenance. We only write to DC1 using local_quorum (using the DataStax Java client), but we see this error in the log:
>
> Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)
>
> Why does it say 4 replicas were required? And why would it return an error to the client, since local_quorum should succeed?
> Here is the output from nodetool status:
>
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: DC2
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns  Host ID  Rack
> UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
> UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
> UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
> DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
> UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
> UJ  10.2.0.6  3.68 GB   256     ?              RAC205
> UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns  Host ID  Rack
> UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
> UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
> UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
> UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
> UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
> UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
> UN  10.1.0.7  5.47 GB   256     8.6%           RAC9
>
> I did a cql trace on the query, and the trace does say
>
> Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873
>
> at the end. I guess that is where the client gets the error from. But the row was inserted into Cassandra correctly. I also traced a read with local_quorum, and it behaves correctly: the reads don't go to DC2. The problem is only with writes at local_quorum.
> {code}
> Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890
>
> activity                                                                                  | timestamp    | source   | source_elapsed
> ------------------------------------------------------------------------------------------+--------------+----------+----------------
> execute_cql3_query                                                                        | 17:27:50,828 | 10.1.0.1 |              0
> Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
> Preparing statement                                                                       | 17:27:50,828 | 10.1.0.1 |            135
> Message received from /10.1.0.1                                                           | 17:27:50,829 | 10.1.0.5 |             25
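One way to read the "4 replica were required" message is that the coordinator's required-ack count added pending (bootstrapping/moving) endpoints without filtering by datacenter. The sketch below is illustrative only, not Cassandra's actual code: the function names, the `Pending` type, and the assumption of exactly two pending range movements in DC2 (the UJ joining node plus the DL leaving node) are mine, not taken from the ticket:

```python
from dataclasses import dataclass

def quorum(rf):
    """Classic quorum: floor(rf/2) + 1, so quorum(3) == 2."""
    return rf // 2 + 1

@dataclass
class Pending:
    dc: str  # datacenter of a pending (bootstrapping/moving) endpoint

def block_for_buggy(local_rf, pending):
    # Suspected bug: every pending endpoint raises the required-ack
    # count, even ones in a remote DC.
    return quorum(local_rf) + len(pending)

def block_for_fixed(local_rf, pending, local_dc):
    # Expected behavior: only pending endpoints in the local DC count
    # toward LOCAL_QUORUM.
    return quorum(local_rf) + sum(1 for p in pending if p.dc == local_dc)

# Hypothetical scenario loosely matching the report: RF = 3 per DC and
# two pending endpoints in DC2 while the client writes to DC1.
pending = [Pending("DC2"), Pending("DC2")]
print(block_for_buggy(3, pending))         # 4 -- "4 replica were required"
print(block_for_fixed(3, pending, "DC1"))  # 2 -- what LOCAL_QUORUM should need
```

Under this reading, only three DC1 replicas can ever acknowledge, so a required count of 4 guarantees the "received 3 of 4" timeout even though the local write succeeded.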
[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Zhu updated CASSANDRA-8819:
-------------------------------
    Description:
We have two DCs, each with 7 nodes.
Here is the keyspace setup:

create keyspace test
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 3, DC1 : 3}
  and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using local_quorum (using the DataStax Java client), but we see this error in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since local_quorum should succeed?
Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query, and the trace does say

Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873

at the end. I guess that is where the client gets the error from. But the row was inserted into Cassandra correctly. I also traced a read with local_quorum, and it behaves correctly: the reads don't go to DC2. The problem is only with writes at local_quorum.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity                                                                                  | timestamp    | source   | source_elapsed
------------------------------------------------------------------------------------------+--------------+----------+----------------
execute_cql3_query                                                                        | 17:27:50,828 | 10.1.0.1 |              0
Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement                                                                       | 17:27:50,828 | 10.1.0.1 |            135
Message received from /10.1.0.1                                                           | 17:27:50,829 | 10.1.0.5 |             25
Sending message to /10.1.0.5                                                              | 17:27:50,829 | 10.1.0.1 |            421
Executing single-partition query on users                                                 | 17:27:50,829 | 10.1.0.5 |            177
Acquiring sstable references                                                              | 17:27:50,829 | 10.1.0.5 |            191
Merging memtable tombstones                                                               | 17:27:50,830 | 10.1.0.5 |            208
Message received from /10.1.0.5                                                           | 17:27:50,830 | 10.1.0.1 |           1461
[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Zhu updated CASSANDRA-8819:
-------------------------------
    Reviewer: Philip Thompson

> LOCAL_QUORUM writes return wrong message
> ----------------------------------------
>
>                 Key: CASSANDRA-8819
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: CentOS 6.6
>            Reporter: Wei Zhu
>             Fix For: 2.0.8
>
> We have two DCs, each with 7 nodes.
> Here is the keyspace setup:
>
> create keyspace test
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {DC2 : 3, DC1 : 3}
>   and durable_writes = true;
>
> We brought down two nodes in DC2 for maintenance. We only write to DC1 using local_quorum (using the DataStax Java client), but we see this error in the log:
>
> Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)
>
> Why does it say 4 replicas were required? And why would it return an error to the client, since local_quorum should succeed?
> Here is the output from nodetool status:
>
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: DC2
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns  Host ID  Rack
> UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
> UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
> UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
> DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
> UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
> UJ  10.2.0.6  3.68 GB   256     ?              RAC205
> UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns  Host ID  Rack
> UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
> UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
> UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
> UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
> UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
> UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
> UN  10.1.0.7  5.47 GB   256     8.6%           RAC9
>
> I did a cql trace on the query, and the trace does say
>
> Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873
>
> at the end. I guess that is where the client gets the error from.
>
> Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890
>
> activity                                                                                  | timestamp    | source   | source_elapsed
> ------------------------------------------------------------------------------------------+--------------+----------+----------------
> execute_cql3_query                                                                        | 17:27:50,828 | 10.1.0.1 |              0
> Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
> Preparing statement                                                                       | 17:27:50,828 | 10.1.0.1 |            135
> Message received from /10.1.0.1                                                           | 17:27:50,829 | 10.1.0.5 |             25
> Sending message to /10.1.0.5                                                              | 17:27:50,829 | 10.1.0.1 |            421
> Executing single-partition query on users                                                 | 17:27:50,829 | 10.1.0.5 |            177
> Acquiring sstable references                                                              | 17:27:50,829 | 10.1.0.5 |            191
[jira] [Created] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message
Wei Zhu created CASSANDRA-8819:
-----------------------------------
             Summary: LOCAL_QUORUM writes return wrong message
                 Key: CASSANDRA-8819
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: CentOS 6.6
            Reporter: Wei Zhu
             Fix For: 2.0.8

We have two DCs, each with 7 nodes.
Here is the keyspace setup:

create keyspace test
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 3, DC1 : 3}
  and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using local_quorum (using the DataStax Java client), but we see this error in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since local_quorum should succeed?
Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query, and the trace does say

Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873

at the end. I guess that is where the client gets the error from.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity                                                                                  | timestamp    | source   | source_elapsed
------------------------------------------------------------------------------------------+--------------+----------+----------------
execute_cql3_query                                                                        | 17:27:50,828 | 10.1.0.1 |              0
Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement                                                                       | 17:27:50,828 | 10.1.0.1 |            135
Message received from /10.1.0.1                                                           | 17:27:50,829 | 10.1.0.5 |             25
Sending message to /10.1.0.5                                                              | 17:27:50,829 | 10.1.0.1 |            421
Executing single-partition query on users                                                 | 17:27:50,829 | 10.1.0.5 |            177
Acquiring sstable references                                                              | 17:27:50,829 | 10.1.0.5 |            191
Merging memtable tombstones                                                               | 17:27:50,830 | 10.1.0.5 |            208
Message received from /10.1.0.5                                                           | 17:27:50,830 | 10.1.0.1 |           1461
[jira] [Updated] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed
[ https://issues.apache.org/jira/browse/CASSANDRA-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Zhu updated CASSANDRA-5342:
-------------------------------
    Attachment: Screen Shot 2013-03-13 at 12.05.08 PM.png

> ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed
>
>                 Key: CASSANDRA-5342
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.10, 1.2.2
>            Reporter: Wei Zhu
>         Attachments: Screen Shot 2013-03-13 at 12.05.08 PM.png
>
> We are using LCS and have a total of 38000 SSTables for one CF. During LCS, over a thousand SSTables can be involved in a single compaction, and all of their IDs are stored in the ancestors field of the new table's SSTableMetadata. In our case, these fields consume more than 1 GB of heap memory. To put it in perspective, the ancestors consume 2-3 times more memory than the bloom filters (fp = 0.1 by default) under LCS.
> We should remove those ancestors from SSTableMetadata once the compaction is finished and the old SSTables are removed. It might not be a big deal for size-tiered compaction, since only a small number of SSTables are involved, but it consumes a lot of memory under LCS.
> At the very least, we shouldn't load those ancestors into memory during startup if the files have already been removed.
> I would love to contribute and provide a patch. Please let me know how to start.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed
Wei Zhu created CASSANDRA-5342:
-----------------------------------
             Summary: ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed
                 Key: CASSANDRA-5342
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.2, 1.1.10
            Reporter: Wei Zhu

We are using LCS and have a total of 38000 SSTables for one CF. During LCS, over a thousand SSTables can be involved in a single compaction, and all of their IDs are stored in the ancestors field of the new table's SSTableMetadata. In our case, these fields consume more than 1 GB of heap memory. To put it in perspective, the ancestors consume 2-3 times more memory than the bloom filters (fp = 0.1 by default) under LCS.

We should remove those ancestors from SSTableMetadata once the compaction is finished and the old SSTables are removed. It might not be a big deal for size-tiered compaction, since only a small number of SSTables are involved, but it consumes a lot of memory under LCS.

At the very least, we shouldn't load those ancestors into memory during startup if the files have already been removed.

I would love to contribute and provide a patch. Please let me know how to start.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
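The "more than 1 GB" figure above is plausible from back-of-the-envelope arithmetic. The sketch below is an estimate only; the per-entry cost of 32 bytes is an assumption (a boxed Integer plus hash-set entry overhead on a 64-bit JVM), not a measured value:

```python
def ancestors_heap_estimate(num_sstables, ancestors_per_table,
                            bytes_per_entry=32):
    """Rough heap cost of keeping every ancestor generation number in a
    per-SSTable collection. bytes_per_entry approximates a boxed Integer
    plus collection-entry overhead on a 64-bit JVM (an assumption)."""
    return num_sstables * ancestors_per_table * bytes_per_entry

# 38000 SSTables, each carrying on the order of a thousand ancestor IDs:
estimate = ancestors_heap_estimate(38_000, 1_000)
print(f"{estimate / 1e9:.1f} GB")  # 1.2 GB -- consistent with "more than 1 GB"
```

The same arithmetic shows why clearing ancestors after compaction helps: the IDs are only needed while the old files still exist, so the steady-state cost should be near zero.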