There are not that many tables, just a large number of splits. There are 25b entries, but each entry is large. Is there an optimal relationship between tserver memory/heap usage and the number of tablets? I saw some references, like https://www.oreilly.com/library/view/accumulo/9781491947098/ch10.html, that state you should keep to 1k tablets per server, but I think that is overkill in our situation. Each tserver is quite large: 16 cores, 128GB.
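As a back-of-the-envelope check on what the 1k-tablets-per-server guideline would imply for the figures quoted in this thread (6-8 nodes, roughly 4k tablets per tserver), here is a minimal Python sketch. The function name is illustrative and the inputs are the approximate numbers from the thread, so the result is only an estimate:

```python
def servers_for_target(total_tablets: int, target_per_server: int) -> int:
    """Smallest number of tservers that keeps each one at or below the target tablet count."""
    if total_tablets < 0 or target_per_server <= 0:
        raise ValueError("total_tablets must be >= 0 and target_per_server must be > 0")
    # Integer ceiling division, so a partial server's worth of tablets still counts as one server.
    return (total_tablets + target_per_server - 1) // target_per_server


if __name__ == "__main__":
    tablets_per_tserver = 4000  # "roughly 4k+ tablets per tserver" (approximate)
    guideline = 1000            # the 1k-per-server figure from the reference above
    for nodes in (6, 8):        # "small cluster (6-8 nodes)"
        total = nodes * tablets_per_tserver
        needed = servers_for_target(total, guideline)
        print(f"{nodes} nodes, {total} tablets: {needed} tservers needed at {guideline} tablets each")
```

Put the other way, staying on the same hardware and meeting that guideline would mean cutting the tablet count by about a factor of four, for example by merging tablets.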
On the tablet.suspend.duration setting: once I update that setting, do I need to restart the master? After updating the setting, I saw that the master log still had the old value (0s), but if I restart the master it shows the correct value. In my testing it didn’t make any difference, but I am just curious.

-S

From: dev1 <d...@etcoleman.com>
Sent: Tuesday, November 30, 2021 9:17 AM
To: 'user@accumulo.apache.org' <user@accumulo.apache.org>
Subject: RE: [EXTERNAL EMAIL] - Re: accumulo tserver rolling restart

One thing that you might be able to optimize is the number of tablets per server. You stated that you have “roughly 4k+ tablets per tserver”. Is that driven by the number of tables, or do you have lots of splits for a much smaller number of tables?

From: Shailesh Ligade <slig...@fbi.gov>
Sent: Monday, November 29, 2021 11:17 AM
To: user@accumulo.apache.org
Subject: Re: [EXTERNAL EMAIL] - Re: accumulo tserver rolling restart

Uhmm, I updated the setting tablet.suspended.duration to 5m:

config -s tablet.suspended.duration=5m

but when I issued a tserver restart (one at a time, without waiting for the first to come up), I still got all tablets unassigned 🙁

Maybe I need to bring the masters down first? BTW, this is for Accumulo 1.10.0. Am I missing anything?

-S
________________________________
From: Shailesh Ligade <slig...@fbi.gov>
Sent: Monday, November 29, 2021 10:35 AM
To: user@accumulo.apache.org <user@accumulo.apache.org>
Subject: Re: [EXTERNAL EMAIL] - Re: accumulo tserver rolling restart

Thanks Michael, stop the cluster using admin stop? The issue is that, since we are using systemd with restart=always, it interferes with any of those stop commands/scripts (stop-all, stop-here, etc.).
So either we have to modify the systemd settings or maybe just do a shut-down-the-VM type of operation (I think that is a little brutal).

-S
________________________________
From: Michael Wall <mjw...@gmail.com>
Sent: Monday, November 29, 2021 9:54 AM
To: user@accumulo.apache.org <user@accumulo.apache.org>
Subject: [EXTERNAL EMAIL] - Re: accumulo tserver rolling restart

Is there a reason to not just stop the cluster, reset the heap and restart the cluster? That is simpler.

On Mon, Nov 29, 2021 at 9:37 AM dev1 <d...@etcoleman.com> wrote:

Yes, and don’t forget to reset it back when you are done.

From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Monday, November 29, 2021 9:36 AM
To: user@accumulo.apache.org
Subject: RE: accumulo tserver rolling restart

Thanks, I am assuming I can set that property using the shell and it will take effect immediately?
Thanks

-S

From: dev1 <d...@etcoleman.com>
Sent: Monday, November 29, 2021 9:25 AM
To: 'user@accumulo.apache.org' <user@accumulo.apache.org>
Subject: [External] RE: accumulo tserver rolling restart

See https://accumulo.apache.org/1.10/accumulo_user_manual.html#_restarting_process_on_a_node (“A note on rolling restarts”). There is a property that can be set (table.suspend.duration) that will delay the reassignment while a tserver is restarting. There is a trade-off in that the data is not available during that time, so try to minimize the time the tserver is off-line.

From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Monday, November 29, 2021 9:19 AM
To: user@accumulo.apache.org
Subject: accumulo tserver rolling restart

Hello,

I want to restart all the tservers, say after I have updated the tserver heap size. Since we are using systemd, I can issue a restart command on a tserver. This causes all sorts of tablet movements even though Accumulo is down for maybe a second. If I wait for the number of unassigned tablets to reach 0 before restarting the next tserver, then completely restarting a small cluster (6-8 nodes) takes hours (roughly 4k+ tablets per tserver).

What may be the right way to perform such a routine maintenance operation?
Is there a delay setting we can change so that it will not move tablets around? What may be a safe delay value? -S
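
The replies in this thread suggest a sequence of: set table.suspend.duration, restart each tserver in turn, then reset the property when done. A minimal Python sketch that only builds the list of commands for such a run (it executes nothing) might look like the following. The host names, the systemd unit name accumulo-tserver, the use of ssh, and the shell user are all assumptions for illustration; 5m is the value tried in this thread, not a recommended setting, and 0s is the old value reported in the master log.

```python
from typing import List


def rolling_restart_plan(
    hosts: List[str],
    suspend: str = "5m",             # value tried in this thread, not a recommendation
    unit: str = "accumulo-tserver",  # hypothetical systemd unit name
    user: str = "root",              # hypothetical Accumulo shell user
) -> List[str]:
    """Return the commands for a rolling tserver restart, in order, without running them."""
    if not hosts:
        raise ValueError("at least one tserver host is required")

    # 1. Delay tablet reassignment while a tserver is briefly down.
    plan = [f'accumulo shell -u {user} -e "config -s table.suspend.duration={suspend}"']

    # 2. Restart the tservers one at a time. This sketch does not check that a
    #    tserver has rejoined before moving on; verify that separately.
    for host in hosts:
        plan.append(f"ssh {host} sudo systemctl restart {unit}")

    # 3. Reset the property when done (0s is the old value reported in this thread).
    plan.append(f'accumulo shell -u {user} -e "config -s table.suspend.duration=0s"')
    return plan


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for command in rolling_restart_plan(["tserver1", "tserver2", "tserver3"]):
        print(command)
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere. Because data hosted by a restarting tserver is unavailable while its tablets are suspended, each restart should be kept as short as possible.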