[ https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725276#comment-16725276 ]
Adar Dembo commented on KUDU-2638:
----------------------------------

Thank you for the log.

Although each server has 12 disks, it would seem that Kudu is configured to use just one for its data directories:

bq. --fs_data_dirs=/data1/data/kudu/tserver-new

This will have a dramatic impact on overall Kudu performance. Firstly, Kudu will only bootstrap one tablet at a time (see the documentation for {{--num_tablets_to_open_simultaneously}}), which helps explain why your tablets take so long to bootstrap. Secondly, your overall disk bandwidth is very low, so maintenance manager flush/compact operations are much slower than they otherwise would be.

If you upgrade to Kudu 1.7 or 1.8 and rebuild your tservers (one at a time), Kudu's metadata will be stored on the same disk as the WALs rather than the first data directory. In your case, with only one data directory, having the metadata colocated with all of Kudu's data is going to make all flush/compact operations slower (as they need to rewrite the tablet superblocks).

Another thing that stands out to me is the relative size of each Kudu data block:

{quote}
1 data directories: /data1/data/kudu/tserver-new/data
Total live blocks: 19299871
Total live bytes: 102086799764
Total live bytes (after alignment): 176281313280
Total number of LBM containers: 226 (17 full)
{quote}

This works out to roughly 5 KB per data block (about 9 KB after alignment). Ideally data blocks would be larger, closer to 1 MB each. Having so many small data blocks means more overhead elsewhere in the system.

Finally, as you pointed out, the number of delta compaction operations is quite high, as is the number of DMS flushes. What kind of workload is this? It seems to be dominated by UPDATEs, which isn't optimal for Kudu.
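
As a rough check on the block-size estimate above, the averages can be computed directly from the quoted report. This is only a sketch: the three figures are copied from the report, while the variable names, the helper function, and the choice of 1 MiB to stand in for "closer to 1 MB" are illustrative assumptions.

```python
# Figures copied from the quoted filesystem report.
live_blocks = 19_299_871
live_bytes = 102_086_799_764
aligned_bytes = 176_281_313_280

# Assumption: 1 MiB stands in for the "closer to 1 MB" target.
target_block_bytes = 1024 * 1024


def avg_block_size(total_bytes: int, blocks: int) -> float:
    """Return the mean number of bytes per block."""
    return total_bytes / blocks


avg_live = avg_block_size(live_bytes, live_blocks)
avg_aligned = avg_block_size(aligned_bytes, live_blocks)

# How many times smaller the average block is than the target size.
shortfall = target_block_bytes / avg_live

print(f"average live block size:    {avg_live / 1024:.1f} KiB")
print(f"average aligned block size: {avg_aligned / 1024:.1f} KiB")
print(f"target is about {shortfall:.0f}x the average live block size")
```

The gap between the live and aligned averages also suggests how much space goes to alignment padding when blocks are this small.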
> kudu cluster restart very long time to reused
> ---------------------------------------------
>
>                 Key: KUDU-2638
>                 URL: https://issues.apache.org/jira/browse/KUDU-2638
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: jiaqiyang
>            Priority: Major
>             Fix For: n/a
>
>         Attachments: kudu16.tc.tablet.png, tserverLog.tar.gz
>
>
> When I restart my Kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
>
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)