[ https://issues.apache.org/jira/browse/KUDU-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631825#comment-17631825 ]
ASF subversion and git services commented on KUDU-3419:
-------------------------------------------------------

Commit 4a8931636eb79d2a3fdee478d7b9e3bc890758b7 in kudu's branch refs/heads/master from xinghuayu007
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=4a8931636 ]

[KUDU-3419] Fix the stuck of starting tablet server

If permission to a tablet metadata file is denied while the tablet
server is starting, the tablet server gets stuck and cannot exit
automatically. When the permission is denied, the TabletServer object
is destroyed, which in turn destroys its thread pool,
txn_status_manager_pool_. That thread pool still contains a running
task, TxnStalenessTrackerTask, and there is no code to shut the task
down in that case. Therefore the tablet server gets stuck.

See KUDU-3419 for details.

Change-Id: I8c9a4f4158fcb0a36499345e00ee72c65f5fefe0
Reviewed-on: http://gerrit.cloudera.org:8080/19203
Reviewed-by: Yingchun Lai <acelyc1112...@gmail.com>
Reviewed-by: Alexey Serbin <ale...@apache.org>
Tested-by: Alexey Serbin <ale...@apache.org>

> Tablet server maybe get stuck when loading tablet metadata failed
> -----------------------------------------------------------------
>
>                 Key: KUDU-3419
>                 URL: https://issues.apache.org/jira/browse/KUDU-3419
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>         Attachments: image-2022-11-04-14-57-49-684.png,
> image-2022-11-04-14-59-54-665.png, image-2022-11-04-15-25-05-437.png,
> image-2022-11-04-15-29-27-092.png, image-2022-11-04-15-30-08-892.png,
> image-2022-11-04-15-32-34-366.png
>
>
> The tablet server may get stuck when loading tablet metadata fails.
> The following steps reproduce the bug.
> 1. Change the permission of one tablet metadata file to root (Kudu
> itself runs under the *kudu* account).
> !image-2022-11-04-14-57-49-684.png!
> 2. Start an instance of the tablet server. A permission error is reported:
> !image-2022-11-04-15-29-27-092.png!
> 3. The tablet server gets stuck and does not exit automatically.
> !image-2022-11-04-15-30-08-892.png!
> 4. The pstack output is shown below. The tablet server cannot exit
> because the ThreadPool cannot be shut down: TxnStalenessTrackerTask is
> still running, which prevents the thread pool from shutting down.
> !image-2022-11-04-15-32-34-366.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
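
For readers unfamiliar with this class of hang, the failure mode described in the commit message can be illustrated with a minimal C++ sketch. This is not Kudu's code: PeriodicTask and StartAndFailEarly are hypothetical names, and a plain std::thread stands in for the thread pool. The sketch only shows the general pattern of signalling a long-running task to stop before its owner waits for it during destruction; the actual change is in the commit referenced above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a long-running periodic task such as
// TxnStalenessTrackerTask. Not Kudu's actual class.
class PeriodicTask {
 public:
  PeriodicTask() : worker_([this] { Run(); }) {}

  // Asks the loop to exit, then waits for the thread. Safe to call twice.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> l(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
    assert(!worker_.joinable());
  }

  // The task is stopped before destruction completes. An owner that only
  // waited for the task here, without signalling it first, would block
  // forever, because Run() never returns on its own.
  ~PeriodicTask() { Shutdown(); }

 private:
  void Run() {
    std::unique_lock<std::mutex> l(mu_);
    while (!stop_) {
      // Periodic work would go here; wake up early when stop_ is set.
      cv_.wait_for(l, std::chrono::milliseconds(10), [this] { return stop_; });
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::thread worker_;  // declared last so the members above exist first
};

// Simulates an early-exit path: the owner goes out of scope shortly after
// startup, as it would after a failed metadata load.
bool StartAndFailEarly() {
  {
    PeriodicTask task;
    std::this_thread::sleep_for(std::chrono::milliseconds(30));
  }  // destructor runs here and returns only once the task has stopped
  return true;
}
```

Calling StartAndFailEarly() returns promptly because the destructor signals the task before joining it. Removing the stop signal from Shutdown() would make the join wait indefinitely, which is the same shape of hang that the pstack output in the issue shows.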