[jira] [Updated] (CASSANDRA-9059) read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL

2015-03-27 Thread Wei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhu updated CASSANDRA-9059:
---
Summary: read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL  (was: read_repair slows down the response for multi-dc setup)

> read_repair slows down the response for multi-dc setup with LOCAL_QUORUM read CL
> 
>
> Key: CASSANDRA-9059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: CentOS 6.6, Cassandra 2.0.8
>Reporter: Wei Zhu
> Fix For: 2.0.14
>
> Attachments: coodidatorreadlatency_after.png, 
> coodidatorreadlatency_before.png
>
>
> We have a two-DC setup for the Cassandra cluster, 7 nodes each, RF = 3. We set
> the CL to LOCAL_QUORUM for both reads and writes.
> We have noticed that the 95th percentile CoordinatorReadLatency is in the 60ms
> range. After we changed read_repair_chance for this CF from the default 0.1 to
> 0.0, we saw a dramatic improvement in CoordinatorReadLatency: it went down to
> 10ms.
> I suspect read_repair somehow waits for/blocks on the response from the other
> DC, which shouldn't happen.





[jira] [Created] (CASSANDRA-9059) read_repair slows down the response for multi-dc setup

2015-03-27 Thread Wei Zhu (JIRA)
Wei Zhu created CASSANDRA-9059:
--

 Summary: read_repair slows down the response for multi-dc setup
 Key: CASSANDRA-9059
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9059
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: CentOS 6.6, Cassandra 2.0.8
Reporter: Wei Zhu
 Attachments: coodidatorreadlatency_after.png, 
coodidatorreadlatency_before.png

We have a two-DC setup for the Cassandra cluster, 7 nodes each, RF = 3. We set
the CL to LOCAL_QUORUM for both reads and writes.
We have noticed that the 95th percentile CoordinatorReadLatency is in the 60ms
range. After we changed read_repair_chance for this CF from the default 0.1 to
0.0, we saw a dramatic improvement in CoordinatorReadLatency: it went down to
10ms.
I suspect read_repair somehow waits for/blocks on the response from the other DC,
which shouldn't happen.
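For reference, the read_repair_chance change described above can be made per CF
with a CQL3 ALTER statement along these lines (the keyspace/table name below is
only a placeholder, not the actual CF from this report):

 ALTER TABLE test.events
   WITH read_repair_chance = 0.0;   -- was the 2.0.x default of 0.1

A possible alternative, not something this report tried, is
dclocal_read_repair_chance, which restricts read repair to the local DC instead
of disabling it.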





[jira] [Commented] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message

2015-02-18 Thread Wei Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326218#comment-14326218
 ] 

Wei Zhu commented on CASSANDRA-8819:


For us, the reason we set the consistency level to LOCAL_QUORUM is that we don't 
require (immediate) consistency in DC2, which in our case is a backup DC. We are 
working under the assumption that we can do whatever we want to DC2 as long as the 
consistency level is LOCAL_QUORUM. I totally agree that non-local endpoints 
should be excluded for LOCAL_QUORUM, even for bootstrapping nodes.
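
For context, the standard quorum arithmetic (just the textbook formula, not a
claim about where the bug lies) is:

 quorum(RF) = floor(RF / 2) + 1
 LOCAL_QUORUM with RF = 3 in the local DC: floor(3 / 2) + 1 = 2 acks

so a LOCAL_QUORUM write here should block for only 2 local acknowledgements. The
"4 replica were required" message quoted below therefore looks like the coordinator
counted endpoints beyond the local quorum, e.g. the joining node, which is
consistent with the bootstrapping-node concern raised above.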

> LOCAL_QUORUM writes return wrong message
> -
>
> Key: CASSANDRA-8819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: CentOS 6.6
>Reporter: Wei Zhu
>Assignee: Tyler Hobbs
> Fix For: 2.0.13
>
>
> We have two DCs, each with 7 nodes.
> Here is the keyspace setup:
>  create keyspace test
>  with placement_strategy = 'NetworkTopologyStrategy'
>  and strategy_options = {DC2 : 3, DC1 : 3}
>  and durable_writes = true;
> We brought down two nodes in DC2 for maintenance. We only write to DC1 using 
> local_quorum (using the DataStax Java client).
> But we see these errors in the log:
> Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica 
> were required but only 3 acknowledged the write)
> Why does it say 4 replicas were required? And why would it return an error to the 
> client, since local_quorum should succeed?
> Here is the output from nodetool status:
> Note: Ownership information does not include topology; for complete 
> information, specify a keyspace
> Datacenter: DC2
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns   Host ID  Rack
> UN  10.2.0.1  10.92 GB  256     7.9%            RAC206
> UN  10.2.0.2  6.17 GB   256     8.0%            RAC106
> UN  10.2.0.3  6.63 GB   256     7.3%            RAC107
> DL  10.2.0.4  1.54 GB   256     7.7%            RAC107
> UN  10.2.0.5  6.02 GB   256     6.6%            RAC106
> UJ  10.2.0.6  3.68 GB   256     ?               RAC205
> UN  10.2.0.7  7.22 GB   256     7.7%            RAC205
> Datacenter: DC1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns   Host ID  Rack
> UN  10.1.0.1  6.04 GB   256     8.6%            RAC10
> UN  10.1.0.2  7.55 GB   256     7.4%            RAC8
> UN  10.1.0.3  5.83 GB   256     7.0%            RAC9
> UN  10.1.0.4  7.34 GB   256     7.9%            RAC6
> UN  10.1.0.5  7.57 GB   256     8.0%            RAC7
> UN  10.1.0.6  5.31 GB   256     7.3%            RAC10
> UN  10.1.0.7  5.47 GB   256     8.6%            RAC9
> I did a CQL trace on the query, and at the end the trace does say
>    Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873
> I guess that is where the client gets the error from. But the row was inserted
> into Cassandra correctly. I also traced a read with local_quorum and it behaves
> correctly: the reads don't go to DC2. The problem is only with writes at
> local_quorum.
> {code}
> Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890
>
>  activity                                  | timestamp    | source   | source_elapsed
> --------------------------------------------+--------------+----------+----------------
>  execute_cql3_query                         | 17:27:50,828 | 10.1.0.1 |              0
>  Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
>  Preparing statement                        | 17:27:50,828 | 10.1.0.1 |            135
>  Message received from /10.1.0.1            | 17:27:50,829 | 10.1.0.5 |             25

[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message

2015-02-17 Thread Wei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhu updated CASSANDRA-8819:
---
Description: 
We have two DCs, each with 7 nodes.
Here is the keyspace setup:

 create keyspace test
 with placement_strategy = 'NetworkTopologyStrategy'
 and strategy_options = {DC2 : 3, DC1 : 3}
 and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using 
local_quorum (using the DataStax Java client).
But we see these errors in the log:
Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica 
were required but only 3 acknowledged the write)
Why does it say 4 replicas were required? And why would it return an error to the 
client, since local_quorum should succeed?

Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete 
information, specify a keyspace
Datacenter: DC2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns   Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%            RAC206
UN  10.2.0.2  6.17 GB   256     8.0%            RAC106
UN  10.2.0.3  6.63 GB   256     7.3%            RAC107
DL  10.2.0.4  1.54 GB   256     7.7%            RAC107
UN  10.2.0.5  6.02 GB   256     6.6%            RAC106
UJ  10.2.0.6  3.68 GB   256     ?               RAC205
UN  10.2.0.7  7.22 GB   256     7.7%            RAC205
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns   Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%            RAC10
UN  10.1.0.2  7.55 GB   256     7.4%            RAC8
UN  10.1.0.3  5.83 GB   256     7.0%            RAC9
UN  10.1.0.4  7.34 GB   256     7.9%            RAC6
UN  10.1.0.5  7.57 GB   256     8.0%            RAC7
UN  10.1.0.6  5.31 GB   256     7.3%            RAC10
UN  10.1.0.7  5.47 GB   256     8.6%            RAC9

I did a CQL trace on the query, and at the end the trace does say

   Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873

I guess that is where the client gets the error from. But the row was inserted 
into Cassandra correctly. I also traced a read with local_quorum and it behaves 
correctly: the reads don't go to DC2. The problem is only with writes at 
local_quorum.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

 activity                                   | timestamp    | source   | source_elapsed
---------------------------------------------+--------------+----------+----------------
 execute_cql3_query                          | 17:27:50,828 | 10.1.0.1 |              0
 Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
 Preparing statement                         | 17:27:50,828 | 10.1.0.1 |            135
 Message received from /10.1.0.1             | 17:27:50,829 | 10.1.0.5 |             25
 Sending message to /10.1.0.5                | 17:27:50,829 | 10.1.0.1 |            421
 Executing single-partition query on users   | 17:27:50,829 | 10.1.0.5 |            177
 Acquiring sstable references                | 17:27:50,829 | 10.1.0.5 |            191
 Merging memtable tombstones                 | 17:27:50,830 | 10.1.0.5 |            208
 Message received from /10.1.0.5             | 17:27:50,830 | 10.1.0.1 |           1461
   

[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message

2015-02-17 Thread Wei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhu updated CASSANDRA-8819:
---
Reviewer: Philip Thompson

> LOCAL_QUORUM writes return wrong message
> -
>
> Key: CASSANDRA-8819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: CentOS 6.6
>Reporter: Wei Zhu
> Fix For: 2.0.8
>
>
> We have two DCs, each with 7 nodes.
> Here is the keyspace setup:
>  create keyspace test
>  with placement_strategy = 'NetworkTopologyStrategy'
>  and strategy_options = {DC2 : 3, DC1 : 3}
>  and durable_writes = true;
> We brought down two nodes in DC2 for maintenance. We only write to DC1 using 
> local_quorum (using the DataStax Java client).
> But we see these errors in the log:
> Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica 
> were required but only 3 acknowledged the write)
> Why does it say 4 replicas were required? And why would it return an error to the 
> client, since local_quorum should succeed?
> Here is the output from nodetool status:
> Note: Ownership information does not include topology; for complete 
> information, specify a keyspace
> Datacenter: DC2
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns   Host ID  Rack
> UN  10.2.0.1  10.92 GB  256     7.9%            RAC206
> UN  10.2.0.2  6.17 GB   256     8.0%            RAC106
> UN  10.2.0.3  6.63 GB   256     7.3%            RAC107
> DL  10.2.0.4  1.54 GB   256     7.7%            RAC107
> UN  10.2.0.5  6.02 GB   256     6.6%            RAC106
> UJ  10.2.0.6  3.68 GB   256     ?               RAC205
> UN  10.2.0.7  7.22 GB   256     7.7%            RAC205
> Datacenter: DC1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address   Load      Tokens  Owns   Host ID  Rack
> UN  10.1.0.1  6.04 GB   256     8.6%            RAC10
> UN  10.1.0.2  7.55 GB   256     7.4%            RAC8
> UN  10.1.0.3  5.83 GB   256     7.0%            RAC9
> UN  10.1.0.4  7.34 GB   256     7.9%            RAC6
> UN  10.1.0.5  7.57 GB   256     8.0%            RAC7
> UN  10.1.0.6  5.31 GB   256     7.3%            RAC10
> UN  10.1.0.7  5.47 GB   256     8.6%            RAC9
> I did a CQL trace on the query, and at the end the trace does say
>    Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873
> I guess that is where the client gets the error from.
> Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890
>
>  activity                                  | timestamp    | source   | source_elapsed
> --------------------------------------------+--------------+----------+----------------
>  execute_cql3_query                         | 17:27:50,828 | 10.1.0.1 |              0
>  Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
>  Preparing statement                        | 17:27:50,828 | 10.1.0.1 |            135
>  Message received from /10.1.0.1            | 17:27:50,829 | 10.1.0.5 |             25
>  Sending message to /10.1.0.5               | 17:27:50,829 | 10.1.0.1 |            421
>  Executing single-partition query on users  | 17:27:50,829 | 10.1.0.5 |            177
>  Acquiring sstable references               | 17:27:50,829 | 10.1.0.5 |            191

[jira] [Created] (CASSANDRA-8819) LOCAL_QUORUM writes return wrong message

2015-02-17 Thread Wei Zhu (JIRA)
Wei Zhu created CASSANDRA-8819:
--

 Summary: LOCAL_QUORUM writes return wrong message
 Key: CASSANDRA-8819
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: CentOS 6.6
Reporter: Wei Zhu
 Fix For: 2.0.8


We have two DCs, each with 7 nodes.
Here is the keyspace setup:

 create keyspace test
 with placement_strategy = 'NetworkTopologyStrategy'
 and strategy_options = {DC2 : 3, DC1 : 3}
 and durable_writes = true;
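
The keyspace definition above uses the old cassandra-cli syntax; for anyone
working in cqlsh, the CQL3 equivalent should be roughly:

 CREATE KEYSPACE test
   WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3}
   AND durable_writes = true;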

We brought down two nodes in DC2 for maintenance. We only write to DC1 using 
local_quorum (using the DataStax Java client).
But we see these errors in the log:
Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica 
were required but only 3 acknowledged the write)
Why does it say 4 replicas were required? And why would it return an error to the 
client, since local_quorum should succeed?

Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete 
information, specify a keyspace
Datacenter: DC2
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns   Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%            RAC206
UN  10.2.0.2  6.17 GB   256     8.0%            RAC106
UN  10.2.0.3  6.63 GB   256     7.3%            RAC107
DL  10.2.0.4  1.54 GB   256     7.7%            RAC107
UN  10.2.0.5  6.02 GB   256     6.6%            RAC106
UJ  10.2.0.6  3.68 GB   256     ?               RAC205
UN  10.2.0.7  7.22 GB   256     7.7%            RAC205
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns   Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%            RAC10
UN  10.1.0.2  7.55 GB   256     7.4%            RAC8
UN  10.1.0.3  5.83 GB   256     7.0%            RAC9
UN  10.1.0.4  7.34 GB   256     7.9%            RAC6
UN  10.1.0.5  7.57 GB   256     8.0%            RAC7
UN  10.1.0.6  5.31 GB   256     7.3%            RAC10
UN  10.1.0.7  5.47 GB   256     8.6%            RAC9

I did a CQL trace on the query, and at the end the trace does say

   Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873

I guess that is where the client gets the error from.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

 activity                                   | timestamp    | source   | source_elapsed
---------------------------------------------+--------------+----------+----------------
 execute_cql3_query                          | 17:27:50,828 | 10.1.0.1 |              0
 Parsing insert into test (user_id, created, event_data, event_id) values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
 Preparing statement                         | 17:27:50,828 | 10.1.0.1 |            135
 Message received from /10.1.0.1             | 17:27:50,829 | 10.1.0.5 |             25
 Sending message to /10.1.0.5                | 17:27:50,829 | 10.1.0.1 |            421
 Executing single-partition query on users   | 17:27:50,829 | 10.1.0.5 |            177
 Acquiring sstable references                | 17:27:50,829 | 10.1.0.5 |            191
 Merging memtable tombstones                 | 17:27:50,830 | 10.1.0.5 |            208
 Message received from /10.1.0.5             | 17:27:50,830 | 10.1.0.1 |           1461
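
For anyone trying to reproduce a trace like the one above, one way to capture it
is from cqlsh (CONSISTENCY and TRACING are standard cqlsh commands; the INSERT is
the same statement that appears in the trace):

 CONSISTENCY LOCAL_QUORUM;
 TRACING ON;
 insert into test (user_id, created, event_data, event_id)
   values (123456789, 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16');
 TRACING OFF;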
 

[jira] [Updated] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed

2013-03-13 Thread Wei Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhu updated CASSANDRA-5342:
---

Attachment: Screen Shot 2013-03-13 at 12.05.08 PM.png

> ancestors are not cleared in SSTableMetadata after compactions are done and 
> old SSTables are removed
> 
>
> Key: CASSANDRA-5342
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.10, 1.2.2
>Reporter: Wei Zhu
> Attachments: Screen Shot 2013-03-13 at 12.05.08 PM.png
>
>
> We are using LCS and have a total of 38,000 SSTables for one CF. During LCS, 
> there can be over a thousand SSTables involved in a single compaction. All those 
> SSTable IDs are stored in the ancestors field of SSTableMetadata for the new 
> table. In our case, those fields consume more than 1 GB of heap memory. To put 
> it in perspective, the ancestors consume 2-3 times more memory than the bloom 
> filters (fp = 0.1 by default) in LCS. 
> We should remove those ancestors from SSTableMetadata after the compaction is 
> finished and the old SSTables are removed. It might not be a big deal for 
> size-tiered compaction, since only a small number of SSTables is involved, but 
> it consumes a lot of memory for LCS. 
> At the very least, we shouldn't load those ancestors into memory during startup 
> if the files have been removed. 
> I would love to contribute and provide a patch. Please let me know how to start.



[jira] [Created] (CASSANDRA-5342) ancestors are not cleared in SSTableMetadata after compactions are done and old SSTables are removed

2013-03-13 Thread Wei Zhu (JIRA)
Wei Zhu created CASSANDRA-5342:
--

 Summary: ancestors are not cleared in SSTableMetadata after 
compactions are done and old SSTables are removed
 Key: CASSANDRA-5342
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5342
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.2, 1.1.10
Reporter: Wei Zhu


We are using LCS and have a total of 38,000 SSTables for one CF. During LCS, there 
can be over a thousand SSTables involved in a single compaction. All those SSTable 
IDs are stored in the ancestors field of SSTableMetadata for the new table. In our 
case, those fields consume more than 1 GB of heap memory. To put it in perspective, 
the ancestors consume 2-3 times more memory than the bloom filters (fp = 0.1 by 
default) in LCS. 
We should remove those ancestors from SSTableMetadata after the compaction is 
finished and the old SSTables are removed. It might not be a big deal for 
size-tiered compaction, since only a small number of SSTables is involved, but it 
consumes a lot of memory for LCS. 
At the very least, we shouldn't load those ancestors into memory during startup if 
the files have been removed. 
I would love to contribute and provide a patch. Please let me know how to start.
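
A rough back-of-envelope, assuming the ancestor generation numbers end up as boxed
Integers in a java.util.Set with roughly 32 bytes of overhead per entry (an
assumption on my part, not something measured for this report):

 38,000 SSTables x ~1,000 ancestor IDs x ~32 bytes/entry  ~=  1.2 GB

which lines up with the >1 GB of heap reported above.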
