[jira] [Updated] (CASSANDRA-13780) ADD Node streaming throughput performance
[ https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Shuler updated CASSANDRA-13780:
---------------------------------------
    Fix Version/s:     (was: 3.0.9)
                       3.0.x

> ADD Node streaming throughput performance
> ------------------------------------------
>
>                 Key: CASSANDRA-13780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                40
> On-line CPU(s) list:   0-39
> Thread(s) per core:    2
> Core(s) per socket:    10
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 79
> Model name:            Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:              1
> CPU MHz:               2199.869
> BogoMIPS:              4399.36
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              25600K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>              total   used   free  shared  buffers  cached
> Mem:          252G   217G    34G    708K     308M    149G
> -/+ buffers/cache:    67G   185G
> Swap:          16G     0B    16G
>
>            Reporter: Kevin Rivait
>            Priority: Major
>             Fix For: 3.0.x
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than what the network and node hardware capacity can support, taking several days per new node. Adjusting stream throughput and other YAML parameters seems to have no effect on performance. Essentially, Cassandra appears to have an architectural scalability problem when adding new nodes to a cluster with moderate to high data ingestion, because it cannot add new node capacity fast enough to keep up with growing ingestion volumes.
>
> Initial Configuration:
> Running 3.0.9, and we have implemented TWCS on one of our largest tables.
> The largest table is partitioned on (ID, MM) using 1-day buckets with a TTL of 60 days.
> The next release will change the partitioning to (ID, MMDD) so that partitions are aligned with the daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS is working as expected; daily SSTables drop off after 70 days (60-day TTL + 10-day grace).
> The current deployment is a 28-node, 2-datacenter cluster, 14 nodes in each DC, replication factor 3.
> Data directories are backed by 4 x 2TB SSDs on each node, plus one 800GB SSD for commit logs.
> The requirement is to double cluster size, capacity, and ingestion volume within a few weeks.
>
> Observed Behavior:
> 1. Streaming throughput during add node - we observed a maximum of 6 Mb/s streaming from each of the 14 nodes on a 20Gb/s switched network, taking at least 106 hours for each node to join the cluster, even though each node is only about 2.2 TB in size.
> 2. Compaction on the newly added node - compaction has fallen behind, with anywhere from 4,000 to 10,000 SSTables at any given time. It took 3 weeks for compaction to finish on each newly added node. Increasing the number of compaction threads to match the number of CPUs (40) and increasing compaction throughput to 32MB/s seemed to be the sweet spot.
> 3. TWCS buckets on the new node - data streamed to this node over 4 1/2 days.
> Compaction correctly placed the data in daily files, but the problem is that the file dates reflect when compaction created the file, not the date of the last record written in the TWCS bucket, which will cause the files to remain around much longer than necessary.
>
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file create date to match the highest date record in the file -or- add another piece of metadata to the TWCS files that reflects the file drop date, so that TWCS partitions can be dropped consistently?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
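For reference, the TWCS setup described in the initial configuration (daily buckets, 60-day TTL, 10-day grace) usually corresponds to table options along the following lines. This is a sketch only: the keyspace/table name ks.events is made up and is not the reporter's actual schema.

    # Illustrative only - ks.events is a hypothetical table name
    cqlsh -e "
    ALTER TABLE ks.events
      WITH compaction = {
             'class': 'TimeWindowCompactionStrategy',
             'compaction_window_unit': 'DAYS',
             'compaction_window_size': '1' }
       AND default_time_to_live = 5184000   -- 60 days
       AND gc_grace_seconds     = 864000;   -- 10 days
    "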
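On the streaming side, the "stream throughput and other YAML parameters" mentioned in the description map to the knobs below on 3.0.x. The values are illustrative, and runtime nodetool changes last only until restart unless the matching cassandra.yaml settings are also updated on every node that streams to the joining node.

    # Check the current outbound streaming cap (megabits/s on 3.0.x)
    nodetool getstreamthroughput

    # Raise the cap at runtime; 0 disables throttling entirely (illustrative value)
    nodetool setstreamthroughput 400

    # Persistent equivalents in cassandra.yaml:
    #   stream_throughput_outbound_megabits_per_sec: 400
    #   inter_dc_stream_throughput_outbound_megabits_per_sec: 400

    # Watch active streams while the new node is joining; hide completed files
    nodetool netstats | grep -v "100%"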
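For the compaction backlog in item 2, the "number of compaction threads" and "32MB/s throughput" settings correspond to the following knobs. Again a sketch with illustrative values; <keyspace>.<table> is a placeholder.

    # Runtime changes (revert on restart)
    nodetool setcompactionthroughput 32      # MB/s cluster-wide compaction cap; 0 = unthrottled
    nodetool getcompactionthroughput

    # Check the backlog of pending compactions and per-table SSTable counts
    nodetool compactionstats
    nodetool tablestats <keyspace>.<table>

    # Persistent equivalents in cassandra.yaml:
    #   concurrent_compactors: 40            # number of compaction threads
    #   compaction_throughput_mb_per_sec: 32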
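On question 2, one possibly relevant detail: each SSTable already records its minimum and maximum write timestamps in its own metadata, independent of the filesystem dates, and these can be inspected offline. A minimal sketch, assuming the sstablemetadata tool bundled with the install and a hypothetical data path:

    # Print metadata for one SSTable; the "Minimum timestamp" / "Maximum timestamp"
    # lines show the write-time range of the data, regardless of the file's mtime.
    # (the path below is a hypothetical example)
    sstablemetadata /var/lib/cassandra/data/ks/table-*/mc-1234-big-Data.db | \
        grep -E 'timestamp|Estimated droppable tombstones'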
[jira] [Updated] (CASSANDRA-13780) ADD Node streaming throughput performance
[ https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13780:
-----------------------------------
    Priority: Major  (was: Blocker)

> ADD Node streaming throughput performance
> ------------------------------------------
>
>                 Key: CASSANDRA-13780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Kevin Rivait
>             Fix For: 3.0.9
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org