[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533640#comment-17533640 ]
dengke commented on KUDU-3367:
------------------------------

The following describes the scenario we observed at the time. We found that scanning a table holding only a small amount of data took a long time.

!image-2022-05-09-14-13-16-525.png!

Running perf table_scan while the cluster was idle, we found that the disk I/O utilization of the tablet was very high.

!image-2022-05-09-14-16-31-828.png!

Check the scan service threads:

!image-2022-05-09-14-18-05-647.png!

Checking the system calls with strace shows that they are reading files frequently:

!image-2022-05-09-14-19-56-933.png!

Check the layout of the tablet; the average rowset height is close to 1:

!image-2022-05-09-14-21-47-374.png!

Check the size of each stored file:

!image-2022-05-09-14-23-43-973.png!

We sampled the data in the rowset files and found that all sampled rows are in the deleted state and that the "write time" of all of them is 2018-08-01.

!image-2022-05-09-14-26-45-313.png!

We analyzed the data layout again and found that there is base data and no UNDO deltas, but many REDO deltas consisting entirely of delete ops.

!image-2022-05-09-14-32-51-573.png!

We checked the code:

{code:java}
double DiskRowSet::DeltaStoresCompactionPerfImprovementScore(DeltaCompactionType type) const {
  DCHECK(open_);
  double perf_improv = 0;
  size_t store_count = CountDeltaStores();
  if (store_count == 0) {
    return perf_improv;
  }
  if (type == RowSet::MAJOR_DELTA_COMPACTION) {
    vector<ColumnId> col_ids_with_updates;
    // We get the column ids with updates, but a delete op does not update any column.
    delta_tracker_->GetColumnIdsWithUpdates(&col_ids_with_updates);
    // If we have files but no updates, we don't want to major compact.
    if (!col_ids_with_updates.empty()) {
      // A delta store containing only delete ops never reaches this branch,
      // which means perf_improv stays 0.
      DiskRowSetSpace drss;
      GetDiskRowSetSpaceUsage(&drss);
      double ratio = static_cast<double>(drss.redo_deltas_size) / drss.base_data_size;
      if (ratio >= FLAGS_tablet_delta_store_major_compact_min_ratio) {
        perf_improv = ratio;
      }
    }
  } else if (type == RowSet::MINOR_DELTA_COMPACTION) {
    if (store_count > 1) {
      perf_improv = static_cast<double>(store_count) / FLAGS_tablet_delta_store_minor_compact_max;
    }
  } else {
    LOG_WITH_PREFIX(FATAL) << "Unknown delta compaction type " << type;
  }
  return std::min(1.0, perf_improv);
}
{code}

So we can draw the following conclusion: if a REDO delta file consists entirely of delete ops, that is, it contains no update ops, the current compaction algorithm will not schedule the file for compaction. If such files accumulate over time, they greatly slow down our scans.

> Delta file with full of delete op can not be schedule to compact
> ----------------------------------------------------------------
>
>                 Key: KUDU-3367
>                 URL: https://issues.apache.org/jira/browse/KUDU-3367
>             Project: Kudu
>          Issue Type: New Feature
>          Components: compaction
>            Reporter: dengke
>            Priority: Major
>         Attachments: image-2022-05-09-14-13-16-525.png,
> image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png,
> image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png,
> image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png,
> image-2022-05-09-14-32-51-573.png
>
>
> If a REDO delta file consists entirely of delete ops, that is, it contains
> no update ops, the current compaction algorithm will not schedule the file
> for compaction. If such files accumulate over time, they greatly slow down
> our scans. However, processing such files on every compaction would reduce
> compaction performance.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)