Re: [EXTERNAL] Cassandra cluster add new node slowly
> I suspect the compactionthroughput has an influence on the new node
> joining. The command nodetool getcompactionthroughput says 'Current
> compaction throughput: 32 MB/s'.

I would say this guess is true, but maybe not in the way you think: the more disk IO you use for compactions, the slower the streaming will be (if the joining machine is CPU or disk IO bound). When using vnodes, all the nodes will be sending data to this new node.

Joining nodes are often stuck at 100% CPU or disk usage (or both). The joining node is treated as an extra replica until it is up, and it is not read from, so this is not a big deal. It is therefore possible to make compaction faster without risk while the node is joining, but doing so might have the opposite effect and slow down streaming, because compaction will use more of the resources that are already a bottleneck during bootstrap.

Be aware that once the node joins you could have performance issues if compactions are taking too many resources, or if you have too many SSTables (the opposite situation, where compaction was running too slowly). It is good to find a balance between the streaming speed and what the node can cope with, so that when the node joins the ring it does so in a healthy state (or at least an acceptable one).

I have often observed that the streaming speed is substantially lower at the end of the bootstrap, as only a few nodes (and eventually just one) are still sending data while the other nodes are done streaming. In this phase compactions can catch up, and the disk space used and the number of SSTables are greatly reduced, allowing the node to join in good conditions.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com
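The slowdown at the end of a bootstrap that Alain describes can be illustrated with a small model. This is only a sketch under strong assumptions: every sender streams at the same constant rate until its share is sent, which is not how Cassandra actually schedules streams. The function name and the 25 MB/s per-sender rate are hypothetical; the byte totals are the per-sender figures from the nodetool netstats output in the original post.

```python
def streaming_phases(bytes_by_sender, per_sender_rate):
    """Return (duration_s, active_senders, aggregate_rate) per phase.

    Simplified model: each sender streams at per_sender_rate (bytes/s)
    until its share is sent, so the aggregate rate drops every time a
    sender finishes.
    """
    phases = []
    already_sent = 0  # bytes every still-active sender has sent so far
    sizes = sorted(bytes_by_sender.values())
    for i, size in enumerate(sizes):
        active = len(sizes) - i
        duration = (size - already_sent) / per_sender_rate
        if duration > 0:
            phases.append((duration, active, active * per_sender_rate))
        already_sent = size
    return phases


if __name__ == "__main__":
    # Totals reported per sender in the original post.
    totals = {
        "*.*.*.3": 328573794609,
        "*.*.*.2": 336762487360,
        "*.*.*.4": 236250553620,
    }
    rate = 25 * 1000 * 1000  # hypothetical 25 MB/s per sender
    for duration, active, aggregate in streaming_phases(totals, rate):
        print(f"{duration / 3600:.2f} h with {active} sender(s), "
              f"{aggregate / 1e6:.0f} MB/s aggregate")
```

In this model the last phase has a single sender, so the new node receives data at a fraction of the earlier rate, which is the window in which compactions can catch up.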
Re: [EXTERNAL] Cassandra cluster add new node slowly
The speed at which compactions operate is also physically restricted by the speed of the disk. If the disks used on the new node are HDDs, then increasing the compaction throughput will be of little help. However, if the disks on the new node are SSDs, then increasing the compaction throughput to at least 64MB/s should help speed up compactions.

Regards,
Anthony
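Anthony's point about disks can be put into numbers with a rough sketch: compaction runs no faster than the lower of the configured throttle and what the disk can sustain. The function names and the disk figures (40 MB/s for an HDD, 300 MB/s for an SSD) are hypothetical and chosen only for illustration. The sketch also assumes that the throttle's "MB" means 2**20 bytes and that a throttle of 0 means unthrottled.

```python
MIB = 1024 * 1024  # assumption: the throttle's "MB" is 2**20 bytes


def effective_compaction_rate(throttle_mb_s, disk_mb_s):
    """Rate compaction can actually reach, in MB/s.

    The throttle is only an upper bound; the disk is the other one.
    A throttle of 0 is treated as "no throttle".
    """
    if throttle_mb_s == 0:
        return disk_mb_s
    return min(throttle_mb_s, disk_mb_s)


def hours_to_rewrite(total_bytes, throttle_mb_s, disk_mb_s):
    """Hours needed to rewrite total_bytes once at the effective rate."""
    rate = effective_compaction_rate(throttle_mb_s, disk_mb_s) * MIB
    return total_bytes / rate / 3600


if __name__ == "__main__":
    streamed = 901586835589  # sum of the three totals in the netstats output
    for label, disk in (("HDD", 40), ("SSD", 300)):  # hypothetical disk rates
        for throttle in (32, 64):
            print(f"{label}, throttle {throttle} MB/s: "
                  f"{hours_to_rewrite(streamed, throttle, disk):.1f} h")
```

With these made-up disk figures, doubling the throttle halves the time on the SSD but gains much less on the HDD, because the disk becomes the limit before the throttle does.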
Re: [EXTERNAL] Cassandra cluster add new node slowly
The Cassandra version is 3.0.9.

I have changed the heap size (to about 32G). Also, the streaming throughput is set to 800MB/sec, and streaming_socket_timeout_in_ms is at the default, 8640.

I suspect the compactionthroughput has an influence on the new node joining. The command nodetool getcompactionthroughput says 'Current compaction throughput: 32 MB/s'.
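One way to check whether the streaming cap is the bottleneck is to compute the shortest possible transfer time it allows. The sketch below assumes the value 800 was set with nodetool setstreamthroughput, which in 3.x, as far as I know, is expressed in megabits per second per sending node and not in megabytes. The conversion constant and the function name are assumptions made for illustration. The total is the sum of the three per-sender totals in the netstats output of the original post.

```python
# Assumption: one "megabit" of stream throughput is 2**20 bits.
BYTES_PER_MEGABIT = 1024 * 1024 / 8


def min_stream_hours(total_bytes, senders, cap_megabits_per_s):
    """Lower bound on transfer time if every sender runs at its cap."""
    aggregate_bytes_per_s = senders * cap_megabits_per_s * BYTES_PER_MEGABIT
    return total_bytes / aggregate_bytes_per_s / 3600


if __name__ == "__main__":
    total = 328573794609 + 336762487360 + 236250553620
    for cap in (200, 800):
        print(f"cap {cap}: at least {min_stream_hours(total, 3, cap):.2f} h")
```

Under these assumptions the bound is a few hours at a cap of 200 and under an hour at 800, so a bootstrap that is still running after 24 hours is probably limited by something other than the cap, such as CPU or disk IO on the joining node.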
RE: [EXTERNAL] Cassandra cluster add new node slowly
You don't mention the version, but here are some general suggestions:

- 2 GB heap is very small for a node, especially with 1 TB+ of data. What is the physical RAM on the host? In general, you want ½ of physical RAM for the JVM. (Look in jvm.options or cassandra-env.sh.)
- You can change the streaming throughput from the existing nodes, if it looks like the new node can handle it. Look at nodetool setstreamthroughput. Default is 200 (megabits/sec).
- You might want to check for a streaming_socket_timeout_in_ms. This has changed over the versions. Some details are at: https://issues.apache.org/jira/browse/CASSANDRA-11839. 24 hours is a good recommendation.
- If your new node can't compact fast enough to keep disk usage down, look at compactionthroughput on that node.
- nodetool netstats | grep -v "100%" is a good way to see what is happening and whether anything is stuck. Newer versions give a bit more info on progress.
- Don't forget to run cleanup on existing nodes after the new nodes are added.

Sean Durity

From: qf zhou [mailto:zhouqf2...@gmail.com]
Sent: Tuesday, January 02, 2018 10:30 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Cassandra cluster add new node slowly

The cluster has 3 nodes, and the data on each node is about 1.2 TB. I want to add two new nodes to expand the cluster.

Following the instructions from the DataStax website, i.e. (http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/operations/opsAddNodeToCluster.html), I tried to add one node to the cluster. However, it is too slow and takes too much time. After about 24 hours, it still had not succeeded.

I ran the command nodetool netstats on the new node. It shows the following (tb1fullwithstate2 is a big table and 90% of the cluster data is in it; it uses TimeWindowCompactionStrategy):

/*.*.*.3
    Receiving 136 files, 328573794609 bytes total. Already received 8 files, 5774621188 bytes total
        tb1/tb1fullwithneweststatetest 3758271/3758271 bytes(100%) received from idx:0/*.*.*.3
        system_distributed/repair_history 57534/57534 bytes(100%) received from idx:0/*.*.*.3
        system_distributed/parent_repair_history 507660/507660 bytes(100%) received from idx:0/*.*.*.3
        tb1/tb1_device_last_state_eachday 15754096/15754096 bytes(100%) received from idx:0/*.*.*.3
        mytest1/tb1_test1 8143775/8143775 bytes(100%) received from idx:0/*.*.*.3
        tb1/tb1fullwithstate 2251191007/2251191007 bytes(100%) received from idx:0/*.*.*.3
        applocationinfo/weiyirong_app 2760/2760 bytes(100%) received from idx:0/*.*.*.3
        tb1/tb1fullwithstate2 3490748006/4909554503 bytes(71%) received from idx:0/*.*.*.3
        tb1/tb1fullwithneweststate 4458079/4458079 bytes(100%) received from idx:0/*.*.*.3
/*.*.*.2
    Receiving 136 files, 336762487360 bytes total. Already received 3 files, 5695770181 bytes total
        system_distributed/repair_history 31684/31684 bytes(100%) received from idx:0/*.*.*.2
        tb1/tb1fullwithstate 908260516/908260516 bytes(100%) received from idx:0/*.*.*.2
        tb1/tb1fullwithstate2 4783622958/4990450588 bytes(95%) received from idx:0/*.*.*.2
        tb1/tb1fullwithneweststate 3855023/3855023 bytes(100%) received from idx:0/*.*.*.2
/*.*.*.4
    Receiving 132 files, 236250553620 bytes total. Already received 10 files, 3117465128 bytes total
        mytest1/wordstest2 46/46 bytes(100%) received from idx:0/*.*.*.4
        tb1/tb1fullwithneweststatetest 3416891/3416891 bytes(100%) received from idx:0/*.*.*.4
        system_distributed/repair_history 39720/39720 bytes(100%) received from idx:0/*.*.*.4
        system_distributed/parent_repair_history 452250/452250 bytes(100%) received from idx:0/*.*.*.4
        mytest1/weblogs 104/104 bytes(100%) received from idx:0/*.*.*.4
        tb1/tb1_device_last_state_eachday 12670998/12670998 bytes(100%) received from idx:0/*.*.*.4
        mytest1/tb1_test1 3257952/3257952 bytes(100%) received from idx:0/*.*.*.4
        tb1/tb1fullwithstate 647702056/647702056 bytes(100%) received from idx:0/*.*.*.4
        applocationinfo/weiyirong_app 3509/3509 bytes(100%) received from idx:0/*.*.*.4
        tb1/tb1fullwithstate2 2446436305/3566562762 bytes(68%) received from idx:0/*.*.*.4
        tb1/tb1fullwithneweststate 3485297/3485297 bytes(100%) received from idx:0/*.*.*.4

Checking the log in logs/system.log shows:

INFO 06:09:33 Updating topology for /*.*.*.2
INFO 06:09:33 U
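The netstats output in the original post can be summarized with a short script. This is a minimal sketch, assuming the summary lines keep the "Receiving N files, X bytes total. Already received M files, Y bytes total" wording shown in the post; other Cassandra versions may print something different. The function names are hypothetical, and the sample text is an abbreviated copy of the posted output.

```python
import re

# Matches the per-sender summary line as it appears in the posted output.
SUMMARY = re.compile(
    r"Receiving\s+(\d+)\s+files,\s+(\d+)\s+bytes total\.\s+"
    r"Already received\s+(\d+)\s+files,\s+(\d+)\s+bytes total"
)


def bootstrap_progress(netstats_text):
    """Sum the per-sender summary lines into one overall progress figure."""
    files = total = files_done = received = 0
    for match in SUMMARY.finditer(netstats_text):
        f, t, fd, r = (int(group) for group in match.groups())
        files, total = files + f, total + t
        files_done, received = files_done + fd, received + r
    percent = 100.0 * received / total if total else 0.0
    return {"files": files, "files_done": files_done,
            "bytes": total, "bytes_done": received, "percent": percent}


def unfinished_files(netstats_text):
    """Per-file lines still in flight, similar to grep -v "100%"."""
    return [line.strip() for line in netstats_text.splitlines()
            if "bytes(" in line and "(100%)" not in line]


SAMPLE_NETSTATS = """\
/*.*.*.3
    Receiving 136 files, 328573794609 bytes total. Already received 8 files, 5774621188 bytes total
        tb1/tb1fullwithstate 2251191007/2251191007 bytes(100%) received from idx:0/*.*.*.3
        tb1/tb1fullwithstate2 3490748006/4909554503 bytes(71%) received from idx:0/*.*.*.3
/*.*.*.2
    Receiving 136 files, 336762487360 bytes total. Already received 3 files, 5695770181 bytes total
        tb1/tb1fullwithstate2 4783622958/4990450588 bytes(95%) received from idx:0/*.*.*.2
/*.*.*.4
    Receiving 132 files, 236250553620 bytes total. Already received 10 files, 3117465128 bytes total
        tb1/tb1fullwithstate2 2446436305/3566562762 bytes(68%) received from idx:0/*.*.*.4
"""

if __name__ == "__main__":
    print(bootstrap_progress(SAMPLE_NETSTATS))
    for line in unfinished_files(SAMPLE_NETSTATS):
        print(line)
```

On the posted figures this gives 21 of 404 files and about 1.6% of the bytes received, with one file of tb1fullwithstate2 in flight from each sender.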