[ https://issues.apache.org/jira/browse/KUDU-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shenxingwuying updated KUDU-3364:
---------------------------------
    Attachment: image-2022-06-01-11-02-20-888.png

> Add TimerThread to ThreadPool to support a category of problem
> --------------------------------------------------------------
>
>                 Key: KUDU-3364
>                 URL: https://issues.apache.org/jira/browse/KUDU-3364
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: shenxingwuying
>            Assignee: shenxingwuying
>            Priority: Minor
>         Attachments: image-2022-06-01-11-02-20-888.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> h1. Scenarios
> In general, I am talking about a category of problem.
> Kudu has a number of periodic or automatically triggered tasks, for example
> automatic rebalancing of cluster data, some GC tasks, and compaction tasks.
> They are implemented on a Kudu Thread, a std::thread, or a ThreadPool, and
> the actual work is either scheduled periodically or triggered by an internal
> policy.
> Because all of this is internal, we cannot influence it from outside.
> We need a way to trigger these kinds of actions under our own control.
> Some of these scenarios are significant.
> Examples follow.
>
> h2. data rebalance
> There are two ways to rebalance:
> 1. enable auto rebalance
> 2. use the rebalance tool, as before 1.14.
> The two approaches may conflict when their operations race, because the
> rebalance tool's logic is somewhat complex and runs in the tool, while auto
> rebalance runs on the master.
> In the future, auto rebalance on the master should become stable and the
> main way to rebalance data. At the same time, administrators need a way to
> trigger a rebalance externally that behaves just like auto rebalance.
> Today, however, auto rebalance runs in a thread on a fixed time period.
> We could add an API to MasterService, but that API would be synchronous and
> very costly; we need an asynchronous way to trigger the rebalance.
>
> h2. auto compaction
> Another example is auto compaction.
> I have found that the compaction strategy is not always effective, so we may
> need a way for admin users to trigger compaction.
> If we could trigger a RowSetInCompaction ourselves, we would not need to
> restart the Kudu cluster.
>
> h1. My Solution
> Add a timer to ThreadPool. The timer is a worker thread that schedules tasks
> onto the specified thread according to time.
> We can restrict TimerThread so that only SERIAL ThreadPoolTokens can enable
> it.
> Pseudo code expressing my intention:
> {code:java}
> // code placeholder
> class TimerThread {
>   class Task {
>     ThreadPoolToken* token;
>     std::function<void()> f;
>   };
>
>   void Schedule(Task task, int delay_ms) {
>     tasks_.insert(...);
>   }
>   void RunLoop() {
>     while (...) {
>       SleepFor(100ms);
>       tasks = FindTasks();
>       for (auto task : tasks) {
>         token = task.token;
>         token->Submit(task.f);
>         tasks_.erase...
>       }
>     }
>   }
>   scoped_refptr<Thread> thread_;
>   std::multimap<MonoTime, Task> tasks_;
> };
> class ThreadPool {
>   ...
>   TimerThread* timer_;
>   ...
> };
> class ThreadPoolToken {
>   void Scheduler();
> };
> {code}
> This scheme is compatible with the existing ThreadPool, since the timer is
> nullptr by default.
> For periodic tasks, we can use a control ThreadPool with a timer to refactor
> some code and make it clearer, avoiding the past problem of too many
> standalone threads.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
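
The TimerThread pseudo code in the issue description can be illustrated with a small standalone C++ sketch. It is only an approximation under stated assumptions, not Kudu code: `Submitter` is a hypothetical stand-in for `ThreadPoolToken::Submit()`, `std::thread` and `std::chrono::steady_clock` replace Kudu's `Thread` and `MonoTime`, and the loop waits on a condition variable until the earliest deadline rather than polling every 100 ms as the pseudo code does.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for ThreadPoolToken::Submit(): accepts a closure and
// arranges for it to run somewhere. This is not Kudu's real interface.
using Submitter = std::function<void(std::function<void()>)>;

class TimerThread {
 public:
  using Clock = std::chrono::steady_clock;

  TimerThread() : thread_([this] { RunLoop(); }) {}

  ~TimerThread() {
    {
      std::lock_guard<std::mutex> l(mu_);
      stop_ = true;
    }
    cv_.notify_one();
    thread_.join();  // Tasks that are still pending are dropped.
  }

  // Queues 'f' to be handed to 'submit' once 'delay' has elapsed.
  void Schedule(Submitter submit, std::function<void()> f,
                std::chrono::milliseconds delay) {
    assert(delay.count() >= 0);
    {
      std::lock_guard<std::mutex> l(mu_);
      tasks_.emplace(Clock::now() + delay,
                     Task{std::move(submit), std::move(f)});
    }
    cv_.notify_one();  // The new task may now be the earliest deadline.
  }

 private:
  struct Task {
    Submitter submit;
    std::function<void()> f;
  };

  void RunLoop() {
    std::unique_lock<std::mutex> l(mu_);
    while (!stop_) {
      if (tasks_.empty()) {
        cv_.wait(l);
        continue;
      }
      const auto deadline = tasks_.begin()->first;
      if (Clock::now() < deadline) {
        cv_.wait_until(l, deadline);
        continue;  // Re-check: woken by timeout, a new task, or stop.
      }
      Task task = std::move(tasks_.begin()->second);
      tasks_.erase(tasks_.begin());
      l.unlock();
      task.submit(std::move(task.f));  // Hand off without holding the lock.
      l.lock();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::multimap<Clock::time_point, Task> tasks_;  // Ordered by deadline.
  std::thread thread_;  // Declared last so the other members exist first.
};
```

A caller would construct one `TimerThread` and call `Schedule(submit, f, delay)` for each delayed task. Keeping the map ordered by deadline means the loop only ever inspects the first entry, and handing off outside the lock keeps a slow `submit` from blocking new `Schedule` calls.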