[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884780#comment-17884780 ]

ASF subversion and git services commented on KUDU-3367:
---

Commit 05043e6aba6ab45c1b77de9f0762de3dfc5a54c0 in kudu's branch refs/heads/branch-1.17.x from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=05043e6ab ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 introduced a regression due to a deficiency in its implementation: when it kicked in, major compactions would fail with errors like the one below:

Corruption: Failed major delta compaction on RowSet(1): No min key found: CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant in Kudu versions 1.12.0 and newer when working with data that supports live row count (see KUDU-1625), a quick-and-dirty fix is to set the default value of the corresponding flag --all_delete_op_delta_file_cnt_for_compaction to a value that effectively disables the KUDU-3367 behavior. This patch does exactly that.
Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng
Tested-by: Alexey Serbin
(cherry picked from commit 3666d2026d48adb5ff636321ef22320a8af5facb)
Conflicts:
  src/kudu/tablet/delta_tracker.cc
Reviewed-on: http://gerrit.cloudera.org:8080/21855
Reviewed-by: Abhishek Chennaka

> Delta file full of delete ops cannot be scheduled for compaction
>
> Key: KUDU-3367
> URL: https://issues.apache.org/jira/browse/KUDU-3367
> Project: Kudu
> Issue Type: New Feature
> Components: compaction
> Reporter: dengke
> Assignee: dengke
> Priority: Major
> Fix For: 1.17.0
>
> Attachments: image-2022-05-09-14-13-16-525.png, image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png, image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png, image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png, image-2022-05-09-14-32-51-573.png, image-2022-11-14-11-02-33-685.png
>
> If we get a REDO delta full of delete ops, which means there is no update op in the file, the current compaction algorithm will not schedule the file for compaction. If such files accumulate for a period of time, they greatly affect our scan speed. However, processing such files on every compaction reduces compaction performance.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884686#comment-17884686 ]

ASF subversion and git services commented on KUDU-3367:
---

Commit 3666d2026d48adb5ff636321ef22320a8af5facb in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3666d2026 ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 introduced a regression due to a deficiency in its implementation: when it kicked in, major compactions would fail with errors like the one below:

Corruption: Failed major delta compaction on RowSet(1): No min key found: CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant in Kudu versions 1.12.0 and newer when working with data that supports live row count (see KUDU-1625), a quick-and-dirty fix is to set the default value of the corresponding flag --all_delete_op_delta_file_cnt_for_compaction to a value that effectively disables the KUDU-3367 behavior. This patch does exactly that.

Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng
Tested-by: Alexey Serbin
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655527#comment-17655527 ]

ASF subversion and git services commented on KUDU-3367:
---

Commit 27072d3382889b1852f4fef58010115585685bd3 in kudu's branch refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=27072d338 ]

[tools] Add 'kudu local_replica tmeta delete_rowsets' to delete rowsets from tablet

There are some use cases where we need to delete rowsets from a tablet. For example:
1. Some blocks are corrupted in a single-node cluster and the server cannot be started. Note: some data will be lost in this case.
2. Some rowsets are fully deleted but their blocks cannot be GCed (KUDU-3367). Note: no data will be lost in this case.

The 'kudu pbc edit' CLI tool can achieve this, but it is error-prone and hard to operate when working with a large amount of data. This patch introduces a new CLI tool, 'kudu local_replica tmeta delete_rowsets', which makes removing rowsets from a tablet much easier.

Change-Id: If2cf9035babf4c3af4c238cebe8dcecd2c65848f
Reviewed-on: http://gerrit.cloudera.org:8080/19357
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649548#comment-17649548 ]

ASF subversion and git services commented on KUDU-3367:
---

Commit ad920e69fcd67ceefa25ea81a38a10a27d9e3afc in kudu's branch refs/heads/master from kedeng
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=ad920e69f ]

KUDU-3367 [compaction] add supplement to gc algorithm

If we get a REDO delta full of delete ops, it means there is not a single update operation in the delta. The current compaction algorithm doesn't run GC on such delta stores, and the accumulation of such stores negatively affects the performance of scan operations.

This patch is a supplement to KUDU-1625: it lets us release storage space for old tablet metadata that does not support the live row count feature. See KUDU-3367 for details.

Change-Id: I8b26737dffecc17688b42188da959b2ba16351ed
Reviewed-on: http://gerrit.cloudera.org:8080/18503
Reviewed-by: Alexey Serbin
Tested-by: Alexey Serbin
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633680#comment-17633680 ]

YifanZhang commented on KUDU-3367:
--

[~Koppa] [~laiyingchun] Ah, indeed, this GC operation relies on live row counting. I agree that we do need to GC deleted rows on tablets that don't support live row counting.
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633469#comment-17633469 ]

dengke commented on KUDU-3367:
--

Yes. After the original occurrence, it didn't happen again for a long time, until last week. I tried setting `tablet_history_max_age_sec` to a small value as [~zhangyifan27] suggested. After observing for a while, I found that it had no effect. The implementation of KUDU-1625 is based on the 'live row count', but the Kudu version in the problematic environment is too old to support that feature:

!image-2022-11-14-11-02-33-685.png!

So I think it is necessary to develop a new processing method for data from old Kudu versions.
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633458#comment-17633458 ]

Yingchun Lai commented on KUDU-3367:
--

[~zhangyifan27] KUDU-1625 depends on the tablet supporting 'live row count' (which was introduced in Kudu 1.12?). Even after upgrading Kudu to a higher version, old existing tablets still don't have such metadata, so the DeletedRowsetGCOp will not work on those tablets. I guess [~Koppa] is trying to make these old tablets able to GC rowsets whose rows are fully deleted, right?
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544095#comment-17544095 ]

YifanZhang commented on KUDU-3367:
--

Maybe related to KUDU-1625.
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543808#comment-17543808 ]

YifanZhang commented on KUDU-3367:
--

I'm curious whether setting `tablet_history_max_age_sec` to a small value helps in your case. If so, will DeletedRowsetGCOp be scheduled and empty RowSets be deleted in time?
[jira] [Commented] (KUDU-3367) Delta file full of delete ops cannot be scheduled for compaction
[ https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533640#comment-17533640 ]

dengke commented on KUDU-3367:
--

The following is the specific description of the scene at that time. We found a table with a small amount of data whose scans took a long time:

!image-2022-05-09-14-13-16-525.png!

Running perf table_scan while the cluster was idle shows that the disk IO utilization of the tablet was very high:

!image-2022-05-09-14-16-31-828.png!

Check the scan service threads:

!image-2022-05-09-14-18-05-647.png!

Checking the system calls with strace shows they were reading files frequently:

!image-2022-05-09-14-19-56-933.png!

Check the layout of the tablet; the average rowset height approaches 1:

!image-2022-05-09-14-21-47-374.png!

Check the size of each stored file:

!image-2022-05-09-14-23-43-973.png!

Sampling the data in the rowset files, we found the sampled rows were all in the deleted state and the "write time" of all rows was 2018-08-01:

!image-2022-05-09-14-26-45-313.png!

Analyzing the data format again, we found that base data exists with no UNDO deltas, but with many REDO deltas full of delete ops:

!image-2022-05-09-14-32-51-573.png!

We checked the code:

{code:java}
double DiskRowSet::DeltaStoresCompactionPerfImprovementScore(DeltaCompactionType type) const {
  DCHECK(open_);
  double perf_improv = 0;
  size_t store_count = CountDeltaStores();
  if (store_count == 0) {
    return perf_improv;
  }
  if (type == RowSet::MAJOR_DELTA_COMPACTION) {
    vector<ColumnId> col_ids_with_updates;
    // We get the column ids with updates, but a delete op updates no column.
    delta_tracker_->GetColumnIdsWithUpdates(&col_ids_with_updates);
    // If we have files but no updates, we don't want to major compact.
    if (!col_ids_with_updates.empty()) {
      // A delete-only delta cannot reach this branch, so its perf_improv stays 0.
      DiskRowSetSpace drss;
      GetDiskRowSetSpaceUsage(&drss);
      double ratio = static_cast<double>(drss.redo_deltas_size) / drss.base_data_size;
      if (ratio >= FLAGS_tablet_delta_store_major_compact_min_ratio) {
        perf_improv = ratio;
      }
    }
  } else if (type == RowSet::MINOR_DELTA_COMPACTION) {
    if (store_count > 1) {
      perf_improv = static_cast<double>(store_count) / FLAGS_tablet_delta_store_minor_compact_max;
    }
  } else {
    LOG_WITH_PREFIX(FATAL) << "Unknown delta compaction type " << type;
  }
  return std::min(1.0, perf_improv);
}
{code}

So we can draw the following conclusion: if we get a REDO delta full of delete ops, which means there is no update op in the file, the current compaction algorithm will not schedule the file for compaction. If such files accumulate for a period of time, they greatly affect our scan speed.