[jira] [Commented] (CASSANDRA-13780) ADD Node streaming throughput performance

Kevin Rivait (JIRA) Mon, 28 Aug 2017 04:51:59 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143687#comment-16143687
 ]


Kevin Rivait commented on CASSANDRA-13780:
------------------------------------------

Yes, we are using vnodes, with num_tokens: 128 on each node.
When we add a fifth node, we see 4 nodes stream to it.
From the system.log:
INFO  [main] 2017-08-24 14:16:56,071 StorageService.java:1170 - JOINING: 
Starting to bootstrap...
INFO  [main] 2017-08-24 14:16:56,187 StreamResultFuture.java:87 - [Stream 
#69767f90-88f8-11e7-aa33-f929dc1360c2] Executing streaming plan for Bootstrap
INFO  [StreamConnectionEstablisher:1] 2017-08-24 14:16:56,188 
StreamSession.java:239 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] 
Starting streaming to /10.126.63.127
INFO  [StreamConnectionEstablisher:2] 2017-08-24 14:16:56,188 
StreamSession.java:239 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] 
Starting streaming to /10.126.63.124
INFO  [StreamConnectionEstablisher:3] 2017-08-24 14:16:56,188 
StreamSession.java:239 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] 
Starting streaming to /10.126.63.125
INFO  [StreamConnectionEstablisher:4] 2017-08-24 14:16:56,189 
StreamSession.java:239 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] 
Starting streaming to /10.126.63.121
INFO  [StreamConnectionEstablisher:4] 2017-08-24 14:16:56,196 
StreamCoordinator.java:213 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2, 
ID#0] Beginning stream session with /10.126.63.121
INFO  [StreamConnectionEstablisher:3] 2017-08-24 14:16:56,196 
StreamCoordinator.java:213 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2, 
ID#0] Beginning stream session with /10.126.63.125
INFO  [StreamConnectionEstablisher:2] 2017-08-24 14:16:56,196 
StreamCoordinator.java:213 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2, 
ID#0] Beginning stream session with /10.126.63.124
INFO  [StreamConnectionEstablisher:1] 2017-08-24 14:16:56,196 
StreamCoordinator.java:213 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2, 
ID#0] Beginning stream session with /10.126.63.127
INFO  [STREAM-IN-/10.126.63.121] 2017-08-24 14:16:56,245 
StreamResultFuture.java:169 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2 
ID#0] Prepare completed. Receiving 9 files(1147643092 bytes), sending 0 files(0 
bytes)
INFO  [STREAM-IN-/10.126.63.127] 2017-08-24 14:16:56,245 
StreamResultFuture.java:169 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2 
ID#0] Prepare completed. Receiving 5 files(1354972399 bytes), sending 0 files(0 
bytes)
INFO  [STREAM-IN-/10.126.63.125] 2017-08-24 14:16:56,248 
StreamResultFuture.java:169 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2 
ID#0] Prepare completed. Receiving 9 files(1276409087 bytes), sending 0 files(0 
bytes)
INFO  [STREAM-IN-/10.126.63.124] 2017-08-24 14:16:56,249 
StreamResultFuture.java:169 - [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2 
ID#0] Prepare completed. Receiving 8 files(1446953252 bytes), sending 0 files(0 
bytes)
INFO  [StreamReceiveTask:1] 2017-08-24 14:22:28,495 StreamResultFuture.java:183 
- [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] Session with /10.126.63.121 is 
complete
INFO  [StreamReceiveTask:1] 2017-08-24 14:23:09,001 StreamResultFuture.java:183 
- [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] Session with /10.126.63.125 is 
complete
INFO  [StreamReceiveTask:1] 2017-08-24 14:23:27,289 StreamResultFuture.java:183 
- [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] Session with /10.126.63.127 is 
complete
INFO  [StreamReceiveTask:1] 2017-08-24 14:23:58,065 StreamResultFuture.java:183 
- [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] Session with /10.126.63.124 is 
complete
INFO  [StreamReceiveTask:1] 2017-08-24 14:23:58,068 StreamResultFuture.java:215 
- [Stream #69767f90-88f8-11e7-aa33-f929dc1360c2] All sessions completed
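
As a rough check on the log above, the effective bootstrap throughput can be worked out from the "Prepare completed" byte counts and the first and last timestamps. This is only a minimal sketch: the values are copied by hand from the log lines, the variable names are illustrative, and decimal megabytes (10^6 bytes) are assumed.

```python
from datetime import datetime

# Byte counts copied from the "Prepare completed" lines, keyed by source node.
received_bytes = {
    "10.126.63.121": 1147643092,
    "10.126.63.127": 1354972399,
    "10.126.63.125": 1276409087,
    "10.126.63.124": 1446953252,
}

# Timestamps copied from the log: "Executing streaming plan for Bootstrap"
# and "All sessions completed".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
start = datetime.strptime("2017-08-24 14:16:56,187", LOG_TS_FORMAT)
end = datetime.strptime("2017-08-24 14:23:58,068", LOG_TS_FORMAT)

total_bytes = sum(received_bytes.values())
elapsed_s = (end - start).total_seconds()

# Aggregate rate seen by the joining node, in decimal MB/s and Mbit/s.
mb_per_s = total_bytes / elapsed_s / 1e6
mbit_per_s = mb_per_s * 8

print(f"received {total_bytes} bytes in {elapsed_s:.3f} s")
print(f"aggregate throughput: {mb_per_s:.2f} MB/s ({mbit_per_s:.1f} Mbit/s)")
```

With these inputs the joining node received about 5.2 GB in about 7 minutes, which works out to roughly 12 MB/s in aggregate across the four streaming sessions.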

> ADD Node streaming throughput performance
> -----------------------------------------
>
>                 Key: CASSANDRA-13780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 
> 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                40
> On-line CPU(s) list:   0-39
> Thread(s) per core:    2
> Core(s) per socket:    10
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 79
> Model name:            Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:              1
> CPU MHz:               2199.869
> BogoMIPS:              4399.36
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              25600K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>              total       used       free     shared    buffers     cached
> Mem:          252G       217G        34G       708K       308M       149G
> -/+ buffers/cache:        67G       185G
> Swap:          16G         0B        16G
>            Reporter: Kevin Rivait
>             Fix For: 3.0.9
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than 
> what the network and node hardware capacity can support, taking several days 
> per new node.  Adjusting stream throughput and other YAML parameters seems to 
> have no effect on performance.  Essentially, it appears that Cassandra has an 
> architecture scalability growth problem when adding new nodes to a moderate 
> to high data ingestion cluster because Cassandra cannot add new node capacity 
> fast enough to keep up with increasing data ingestion volumes and growth.
> Initial Configuration: 
> Running 3.0.9 and have implemented TWCS on one of our largest tables.
> Largest table partitioned on (ID, YYYYMM)  using 1 day buckets with a TTL of 
> 60 days.
> Next release will change partitioning to (ID, YYYYMMDD) so that partitions 
> are aligned with daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS is working as expected: daily SSTables are dropped after 70 days
> (60-day TTL + 10-day grace).
> Current deployment is a 28-node, 2-datacenter cluster with 14 nodes in each 
> DC and replication factor 3.
> Data directories are backed by four 2TB SSDs on each node, with one 800GB 
> SSD for commit logs.
> Requirement is to double cluster size, capacity, and ingestion volume within 
> a few weeks.
> Observed Behavior:
> 1. Streaming throughput during add node: we observed a maximum of 6 Mb/s 
> streaming from each of the 14 nodes on a 20Gb/s switched network, taking at 
> least 106 hours for each node to join the cluster, even though each node is 
> only about 2.2 TB in size.
> 2. Compaction on the newly added node: compaction has fallen behind, with 
> anywhere from 4,000 to 10,000 SSTables at any given time.  It took 3 weeks 
> for compaction to finish on each newly added node.  Increasing the number of 
> compaction threads to match the number of CPUs (40) and increasing compaction 
> throughput to 32MB/s seemed to be the sweet spot. 
> 3. TWCS buckets on the new node: data streamed to this node over 4 1/2 days.  
> Compaction correctly placed the data in daily files, but the file dates 
> reflect when compaction created the file rather than the date of the last 
> record written in the TWCS bucket, so the files will remain around much 
> longer than necessary.  
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new 
> node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file 
> create date to match the highest date record in the file, or add another 
> piece of metadata to the TWCS files that reflects the file drop date, so that 
> TWCS partitions can be dropped consistently?
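
For reference, the figures quoted in observation 1 imply an aggregate receive rate that can be computed directly. This is a back-of-the-envelope sketch, not a measurement: it assumes 1 TB = 10^12 bytes, treats the 20Gb/s figure as 20 * 10^9 bits per second, and uses illustrative variable names.

```python
# Figures taken from the issue description (observation 1).
node_size_bytes = 2.2e12      # about 2.2 TB per node, assuming decimal TB
join_time_hours = 106         # "at least 106 hours" to join the cluster
link_bits_per_s = 20e9        # 20Gb/s switched network, assuming decimal Gb

# Aggregate rate at which the joining node actually received data.
join_time_s = join_time_hours * 3600
observed_bytes_per_s = node_size_bytes / join_time_s

# Time the same transfer would take if the link were the only limit.
link_bytes_per_s = link_bits_per_s / 8
ideal_seconds = node_size_bytes / link_bytes_per_s

print(f"observed aggregate rate: {observed_bytes_per_s / 1e6:.2f} MB/s")
print(f"transfer time at link speed: {ideal_seconds / 60:.1f} minutes")
print(f"link speed / observed rate: {link_bytes_per_s / observed_bytes_per_s:.0f}x")
```

Under these assumptions the observed aggregate rate is a little under 6 MB/s, while the same volume at full link speed would take on the order of 15 minutes.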



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
