Thanks to you, I found a good tool called Reaper (http://cassandra-reaper.io/).
I will try it.

> On 4 Dec 2018, at 3:30 PM, Elliott Sims <elli...@backblaze.com> wrote:
>
> It depends on the type of repair, but you'll want to make sure all the data
> is where it should be before running cleanup. Somewhat related, if you're
> not running regular repairs already, you should be. You can do it via cron,
> but I strongly suggest checking out Reaper.
>
> On Wed, Nov 28, 2018, 8:05 PM Eunsu Kim <eunsu.bil...@gmail.com> wrote:
> Thank you for your response.
>
> Following your advice, I will run repair on datacenter2. Do I have to run
> repair on every node in datacenter2?
>
> "nodetool listsnapshots" shows no snapshots.
>
> Thank you.
>
>> On 29 Nov 2018, at 4:31 AM, Elliott Sims <elli...@backblaze.com> wrote:
>>
>> I think you answered your own question, sort of.
>>
>> When you expand a cluster, it copies the appropriate rows to the new node(s)
>> but doesn't automatically remove them from the old nodes. When you ran
>> cleanup on datacenter1, it cleared out those old extra copies. I would
>> suggest running a repair first for safety on datacenter2, then a "nodetool
>> cleanup" on those hosts.
>>
>> Also run "nodetool listsnapshots" to make sure you don't have any old
>> snapshots sitting around taking up space.
>>
>> On Wed, Nov 28, 2018 at 5:29 AM Eunsu Kim <eunsu.bil...@gmail.com> wrote:
>> (I am sending the previous mail again because it seems that it was not
>> sent properly.)
>>
>> Hi experts,
>>
>> I am running 2 datacenters, each containing five nodes. (10 nodes in total,
>> all 3.11.3)
>>
>> My data is stored with one replica in each datacenter. (REPLICATION = {
>> 'class' : 'org.apache.cassandra.locator.NetworkTopologyStrategy',
>> 'datacenter1': '1', 'datacenter2': '1' })
>>
>> Most of my data has a short TTL (14 days). The gc_grace_seconds value for
>> all tables is also 600 sec.
>>
>> I expected the two datacenters to use the same amount of disk, but
>> datacenter2 is using more. It seems that the data in datacenter2 is rarely
>> deleted. While the disk usage for datacenter1 remains constant, the disk
>> usage for datacenter2 continues to grow.
>>
>> ------------------
>> Datacenter: datacenter1
>> =======================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address       Load         Tokens  Owns (effective)  Host ID                               Rack
>> UN  10.61.58.228  925.48 GiB   256     21.5%             60d1bac8-b4d6-4e02-a05f-badee0bb36f5  rack1
>> UN  10.61.58.167  840 GiB      256     20.0%             a04fc77a-907f-490c-971c-4e1f964c7b14  rack1
>> UN  10.61.75.86   1.13 TiB     256     19.3%             618c101b-036d-42e7-bf9f-2bcbd429cbd1  rack1
>> UN  10.61.59.22   844.19 GiB   256     20.0%             d8a4a165-13f0-4f4a-9278-4024730b8116  rack1
>> UN  10.61.59.82   737.88 GiB   256     19.2%             054a4eb5-6d1c-46fa-b550-34da610da4e0  rack1
>> Datacenter: datacenter2
>> =======================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address       Load         Tokens  Owns (effective)  Host ID                               Rack
>> UN  10.42.6.120   1.11 TiB     256     18.6%             69f15be0-e5a1-474e-87cf-b063e6854402  rack1
>> UN  10.42.5.207   1.17 TiB     256     20.0%             f78bdce5-cb01-47e0-90b9-fcc31568e49e  rack1
>> UN  10.42.6.47    1.01 TiB     256     20.1%             3ff93b47-2c15-4e1a-a4ea-2596f26b4281  rack1
>> UN  10.42.6.48    1007.67 GiB  256     20.4%             8cbbe76d-6496-403a-8b09-fe6812c9dea2  rack1
>> UN  10.42.5.208   1.29 TiB     256     20.9%             4aa96c6a-6083-417f-a58a-ec847bcbfc7e  rack1
>> ------------------
>>
>> A few days ago, one node in datacenter1 broke down and I replaced it, then
>> ran rebuild, repair, and cleanup.
>>
>> What else can I do?
>>
>> Thank you in advance.
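
For reference, the order suggested in the thread (repair on datacenter2 first, then "nodetool cleanup" on those hosts, then a snapshot check) could be sketched as a dry-run plan like the one below. This is only an illustration under some assumptions: the function name maintenance_plan and the DC2_HOSTS list are hypothetical, the addresses are taken from the status output quoted above, and it assumes nodetool can reach each node remotely with -h (JMX is often restricted to localhost, in which case each command would be run on the node itself). The script only prints commands and does not execute anything.

```python
from typing import List

# Hypothetical host list: the datacenter2 addresses from the quoted status output.
DC2_HOSTS = [
    "10.42.6.120",
    "10.42.5.207",
    "10.42.6.47",
    "10.42.6.48",
    "10.42.5.208",
]


def maintenance_plan(hosts: List[str]) -> List[List[str]]:
    """Build nodetool commands in the order suggested in the thread.

    Every host is repaired before any host is cleaned up, so that the data
    is where it should be before the extra copies are removed. The snapshot
    check comes last.
    """
    plan = []
    for step in ("repair", "cleanup", "listsnapshots"):
        for host in hosts:
            plan.append(["nodetool", "-h", host, step])
    return plan


if __name__ == "__main__":
    # Dry run: print the commands instead of running them.
    for command in maintenance_plan(DC2_HOSTS):
        print(" ".join(command))
```

Printing the plan first makes it easy to review the order before running anything by hand. For recurring repairs, a scheduler such as Reaper, as recommended above, is likely a better fit than a script like this.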