Just thinking about other ways that might work - have not tried any of this, so safe may be relative...
Sometimes it seems easier to give Accumulo what it wants rather than fighting it - an example would be when you have a "missing" file - you can add an "empty" file to serve as a placeholder and things will progress. With that as an analogy - what if you synthetically added data that corresponded to the splits that it is looking for?

If you added rows, with a TTL that was expired - or very short then it should not be returned in queries - and once compacted should go away. If you use visibilities, you could pick a value that would be inaccessible to users. If you can use visibilities you may want to use a TTL to keep the entries around long enough to complete whatever you need to do to get the splits back to what you want. That way the balancer would have the rows even if a compaction ran.

If the incorrectly named splits will sort to a range, then clean-up could be easier - or you can scan using the fake visibility and that should only return the synthetic rows - or just keep track of what you added.

With the "missing" splits added, then maybe the balancer will complete faster and settle down, you could then work to merge those splits away. Merging is usually not a speedy operation - running a compaction before the merge can sometimes help.

Ed Coleman

-----Original Message-----
From: McClure, Bruce MR 2 <[email protected]>
Sent: Monday, November 22, 2021 6:15 PM
To: [email protected]
Subject: RE: Triggering Table Balancer in Accumulo [SEC=UNOFFICIAL]

UNOFFICIAL

Hi,

After looking at the master logs, I can see that the custom balancer is running every few minutes as you said, but reporting problems with some splits that do not conform to the expected naming scheme for the rows (non-existent row-id). I also see errors and warnings in the tserver logs "Failed to find midpoint using indexes, falling back to data files which is slower. No entries between ...", which reference the same incorrectly named splits that the balancer is complaining about.
Attempts have been made to merge these incorrect and empty splits (which were created by human error) out of the system by merging a range on either side of the bad split. However, this has taken a very long time (multiple hours) to run for a single range and there are quite a number of them.

QUESTION: Is there a safe, relatively quick way to remove manually created splits that were created with the addsplits accumulo shell command?

Thanks,
Bruce.

-----Original Message-----
From: Christopher <[email protected]>
Sent: Monday, 1 November 2021 10:40 PM
To: accumulo-user <[email protected]>
Subject: Re: Triggering Table Balancer in Accumulo [SEC=UNOFFICIAL]

Hi Bruce,

We don't have an API for forcing the balancer to rebalance, but I believe it automatically runs every couple of minutes. So, it should get frequent opportunities to rebalance. It shouldn't be necessary to force a rebalance, if your balancer logic takes into account all the factors you care about.

If you need to force it, killing a tserver and allowing its tablets to be reassigned can be relatively unintrusive, provided you don't have a lot of ingest going on, and your tables are flushed (to avoid WAL recovery costs). Another way might be to take the table offline and back online again, but that feels more intrusive to me, because it would affect an entire table. You could also manipulate the metadata table for the tablet to remove the saved location information while it is offline (don't do this while it is online), in order to keep tablets from just being reassigned back to their previous servers.

Regarding the empty splits, Accumulo generally balances tablets without regard to their contents, because we can't know how the application intends to use the splits (I say generally, because a custom balancer could be written to do anything).
It's expected that the application's schema and the user's choice of manual splits reflect their preferred distribution of data across tablets, so the balancer only has to care about the number of tablets without regard to what they contain.

You can merge empty tablets away if you don't need them, especially for pre-splits that you didn't end up using, but this incurs a cost in terms of chop-compactions on adjacent tablets. This might be acceptable. There has been some discussion about a feature to avoid chop-compactions, which would be nice because it would make merges much more instantaneous and cost-free, but it is not implemented yet.

On Sun, Oct 31, 2021 at 8:59 PM McClure, Bruce MR 2 <[email protected]> wrote:
>
> UNOFFICIAL
>
> Hi,
>
> I have a custom table balancer set on a particular table, and a cron job that
> creates splits for the next-days data, each day. Normally it is all fine,
> but after some problems happened, I found that for certain days all the
> splits resided on a single tablet server – which then caused performance
> problems with ingest. This was solved by temporarily taking the tablet
> server out of the cluster (stopping the Accumulo service not HDFS) and then
> (days) later putting it back. This caused a re-assignment of the tablets and
> presumably triggered the table balancer as part of that. This seemed like a
> very heavy-handed solution and brought about the question:
>
> What is the recommended (least intrusive) way to trigger the table balancer
> in Accumulo for a known set of splits (tablets)?
>
> Additional information: whilst the cluster is well balanced in terms of
> tablets-per-server, there is an imbalance in terms of entries (3-1 or 5-1 in
> some cases). I noticed that the new (empty) pre-splits appeared to be on the
> server or servers with significantly less entries.
>
> Thank you in advance.
>
> Bruce.
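
A rough illustration of the point made in the thread that clean-up could be easier if the incorrectly named splits sort into a range: when several bad splits sit next to each other, one merge from the last good split before them to the first good split after them should remove the whole run, rather than one merge per bad split. The Python sketch below only computes those ranges from a list of split points. Everything specific in it is an assumption for illustration: the 8-digit naming scheme in `looks_valid`, the table name `mytable`, the sample splits, and the helper names are hypothetical, and the printed lines assume the Accumulo shell's `merge` command accepts `-t`, `-b` and `-e` for table, begin row and end row, so check them against your version before running anything. It has not been tried against a real cluster.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# (begin, end) rows for one merge; None means "start/end of the table".
Range = Tuple[Optional[str], Optional[str]]


def looks_valid(split: str) -> bool:
    """Hypothetical naming scheme: a valid split is exactly 8 digits (yyyymmdd)."""
    return re.fullmatch(r"\d{8}", split) is not None


def merge_ranges(splits: Iterable[str], is_valid: Callable[[str], bool]) -> List[Range]:
    """Return one (previous good split, next good split) range per run of bad splits."""
    ranges: List[Range] = []
    prev_good: Optional[str] = None
    in_bad_run = False
    # Plain string sort; assumes ASCII split points so it matches byte order.
    for split in sorted(splits):
        if is_valid(split):
            if in_bad_run:
                ranges.append((prev_good, split))
                in_bad_run = False
            prev_good = split
        else:
            in_bad_run = True
    if in_bad_run:
        # The bad run extends to the end of the table.
        ranges.append((prev_good, None))
    return ranges


# Made-up split points: two adjacent bad splits, then a single bad one.
splits = [
    "20211101", "20211102", "20211102x", "20211102y",
    "20211103", "20211104_tmp", "20211105",
]

for begin, end in merge_ranges(splits, looks_valid):
    opts = ["-t mytable"]
    if begin is not None:
        opts.append(f"-b {begin}")
    if end is not None:
        opts.append(f"-e {end}")
    print("merge " + " ".join(opts))
```

With the sample list, the two adjacent bad splits collapse into a single range between `20211102` and `20211103`, so only two merges are suggested for three bad splits. The good splits at each end of a range are kept.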
