[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2024-09-25 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884780#comment-17884780
 ] 

ASF subversion and git services commented on KUDU-3367:
---

Commit 05043e6aba6ab45c1b77de9f0762de3dfc5a54c0 in kudu's branch 
refs/heads/branch-1.17.x from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=05043e6ab ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 introduced a regression due to a deficiency
in its implementation: once it kicked in, major compactions would fail
with errors like the one below:

  Corruption: Failed major delta compaction on RowSet(1): No min key found: 
CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant for Kudu versions 1.12.0 and
newer when working with data that supports live row count (see
KUDU-1625), a quick-and-dirty fix is to set the default value of the
corresponding flag --all_delete_op_delta_file_cnt_for_compaction
to a value that effectively disables the KUDU-3367 behavior.
This patch does exactly that.

Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng 
Tested-by: Alexey Serbin 
(cherry picked from commit 3666d2026d48adb5ff636321ef22320a8af5facb)
  Conflicts:
src/kudu/tablet/delta_tracker.cc
Reviewed-on: http://gerrit.cloudera.org:8080/21855
Reviewed-by: Abhishek Chennaka 
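
To illustrate the "disabled by default" pattern described in the commit message above, here is a minimal sketch (not the actual Kudu code, which lives in src/kudu/tablet/delta_tracker.cc; the flag's type, the sentinel default, and the helper function below are assumptions for illustration only):

{code:java}
// Sketch only: a gflag whose default is a sentinel that effectively turns the
// KUDU-3367 heuristic off unless an operator explicitly lowers it.
#include <cstdint>
#include <limits>
#include <gflags/gflags.h>

// Hypothetical type and default; the real definition may differ.
DEFINE_int64(all_delete_op_delta_file_cnt_for_compaction,
             std::numeric_limits<int64_t>::max(),
             "Number of delete-only delta files in a rowset that triggers a "
             "delta compaction; an unreachable default disables the behavior.");

// Hypothetical helper showing how such a threshold would be consulted.
bool ShouldScheduleDeleteOnlyCompaction(int64_t delete_only_delta_file_count) {
  return delete_only_delta_file_count >=
         FLAGS_all_delete_op_delta_file_cnt_for_compaction;
}
{code}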


> Delta file with full of delete op can not be schedule to compact
> 
>
> Key: KUDU-3367
> URL: https://issues.apache.org/jira/browse/KUDU-3367
> Project: Kudu
>  Issue Type: New Feature
>  Components: compaction
>Reporter: dengke
>Assignee: dengke
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: image-2022-05-09-14-13-16-525.png, 
> image-2022-05-09-14-16-31-828.png, image-2022-05-09-14-18-05-647.png, 
> image-2022-05-09-14-19-56-933.png, image-2022-05-09-14-21-47-374.png, 
> image-2022-05-09-14-23-43-973.png, image-2022-05-09-14-26-45-313.png, 
> image-2022-05-09-14-32-51-573.png, image-2022-11-14-11-02-33-685.png
>
>
> If we get a REDO delta full of delete ops, which means there is no update
> op in the file, the current compaction algorithm will not schedule the file
> for compaction. If such files exist and accumulate for a period of time, they
> will greatly affect our scan speed. However, processing such files on every
> compaction pass reduces compaction performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2024-09-25 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884686#comment-17884686
 ] 

ASF subversion and git services commented on KUDU-3367:
---

Commit 3666d2026d48adb5ff636321ef22320a8af5facb in kudu's branch 
refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=3666d2026 ]

KUDU-3619 disable KUDU-3367 behavior by default

As it turned out, KUDU-3367 introduced a regression due to a deficiency
in its implementation: once it kicked in, major compactions would fail
with errors like the one below:

  Corruption: Failed major delta compaction on RowSet(1): No min key found: 
CFile base data in RowSet(1)

Since KUDU-3367 isn't quite relevant for Kudu versions 1.12.0 and
newer when working with data that supports live row count (see
KUDU-1625), a quick-and-dirty fix is to set the default value of the
corresponding flag --all_delete_op_delta_file_cnt_for_compaction
to a value that effectively disables the KUDU-3367 behavior.
This patch does exactly that.

Change-Id: Iec0719462e379b7a0fb05ca011bb9cdd991a58ef
Reviewed-on: http://gerrit.cloudera.org:8080/21848
Reviewed-by: KeDeng 
Tested-by: Alexey Serbin 




[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2023-01-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655527#comment-17655527
 ] 

ASF subversion and git services commented on KUDU-3367:
---

Commit 27072d3382889b1852f4fef58010115585685bd3 in kudu's branch 
refs/heads/master from Yingchun Lai
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=27072d338 ]

[tools] Add 'kudu local_replica tmeta delete_rowsets' to delete rowsets from 
tablet

There are some use cases where we need to delete rowsets from a tablet.
For example:
1. Some blocks are corrupted in a single-node cluster and the server cannot
   be started. Note: some data will be lost in this case.
2. Some rowsets are fully deleted but the blocks cannot be GCed (KUDU-3367).
   Note: no data will be lost in this case.

The existing 'kudu pbc edit' CLI tool can achieve that, but it's error-prone
and hard to operate when working with a large amount of data.

This patch introduces a new CLI tool 'kudu local_replica tmeta delete_rowsets'
which makes removing rowsets from a tablet much easier.

Change-Id: If2cf9035babf4c3af4c238cebe8dcecd2c65848f
Reviewed-on: http://gerrit.cloudera.org:8080/19357
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
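
A rough sketch of the idea behind the tool, using plain standard-library types rather than Kudu's actual tablet-metadata classes (all names below are illustrative, and the real tool operates on the on-disk tablet metadata/superblock rather than an in-memory vector):

{code:java}
// Illustrative sketch: prune a tablet's rowset list by id, the core of what
// 'kudu local_replica tmeta delete_rowsets' does against real tablet metadata.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <unordered_set>
#include <vector>

struct RowSetEntry {   // stand-in for a rowset's metadata entry
  int64_t id;
  // block ids, bounds, etc. omitted
};

std::vector<RowSetEntry> DeleteRowsets(std::vector<RowSetEntry> rowsets,
                                       const std::unordered_set<int64_t>& ids) {
  rowsets.erase(std::remove_if(rowsets.begin(), rowsets.end(),
                               [&](const RowSetEntry& rs) { return ids.count(rs.id) > 0; }),
                rowsets.end());
  return rowsets;  // the real tool would persist the pruned metadata back to disk
}

int main() {
  auto pruned = DeleteRowsets({{1}, {2}, {3}}, /*ids=*/{2});
  for (const auto& rs : pruned) {
    std::cout << rs.id << std::endl;  // prints 1 and 3
  }
  return 0;
}
{code}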




[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-12-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649548#comment-17649548
 ] 

ASF subversion and git services commented on KUDU-3367:
---

Commit ad920e69fcd67ceefa25ea81a38a10a27d9e3afc in kudu's branch 
refs/heads/master from kedeng
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=ad920e69f ]

KUDU-3367 [compaction] add supplement to gc algorithm

If we get a REDO delta full of delete ops, it means there is not a single
update operation in the delta. The current compaction algorithm doesn't
run GC on such deltamemstores, and the accumulation of deltamemstores
like that negatively affects the performance of scan operations.

This patch serves as a supplement to KUDU-1625: it makes it possible to
release storage space for old tablets whose metadata does not support the
live row count feature. See KUDU-3367 for details.

Change-Id: I8b26737dffecc17688b42188da959b2ba16351ed
Reviewed-on: http://gerrit.cloudera.org:8080/18503
Reviewed-by: Alexey Serbin 
Tested-by: Alexey Serbin 
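
The gist of the supplement, sketched below with illustrative names (the real change is wired into the delta compaction scoring path and gated by the --all_delete_op_delta_file_cnt_for_compaction flag mentioned elsewhere in this thread; the function signature and counting shown here are assumptions):

{code:java}
// Sketch only: give a rowset a nonzero compaction score once enough of its
// delta files contain nothing but DELETE ops. Names and shape are illustrative.
#include <algorithm>
#include <cstddef>
#include <vector>

struct DeltaFileStats {
  size_t update_count;  // UPDATE ops recorded in the delta file
  size_t delete_count;  // DELETE ops recorded in the delta file
};

double DeleteOnlyDeltaScore(const std::vector<DeltaFileStats>& delta_files,
                            size_t cnt_for_compaction) {  // the flag value
  size_t delete_only_files = 0;
  for (const auto& f : delta_files) {
    if (f.update_count == 0 && f.delete_count > 0) {
      ++delete_only_files;
    }
  }
  if (cnt_for_compaction == 0) {
    return 0.0;  // guard; a huge flag value likewise keeps this path dormant
  }
  // Score ramps up as delete-only files accumulate, capped at 1.0 like the
  // other perf-improvement scores.
  return std::min(1.0, static_cast<double>(delete_only_files) / cnt_for_compaction);
}
{code}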




[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-11-14 Thread YifanZhang (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633680#comment-17633680
 ] 

YifanZhang commented on KUDU-3367:
--

[~Koppa] [~laiyingchun] Ah, indeed, this GC operation relies on live row
counting. I agree that we do need to GC deleted rows on tablets that don't
support live row counting.



[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-11-13 Thread dengke (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633469#comment-17633469
 ] 

dengke commented on KUDU-3367:
--

Yes. After it first happened it did not recur for a long time, until last
week. I tried setting `tablet_history_max_age_sec` to a small value as
[~zhangyifan27] suggested. After observing for a while, I found that it had
no effect.

The implementation of KUDU-1625 relies on the 'live row count' feature, but
the Kudu version in the affected environment is too old to support it:

!image-2022-11-14-11-02-33-685.png!

So I think it is necessary to develop a new way to handle data written by
older versions of Kudu.



[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-11-13 Thread Yingchun Lai (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633458#comment-17633458
 ] 

Yingchun Lai commented on KUDU-3367:


[~zhangyifan27] KUDU-1625 depends on the tablet supporting 'live row count'
(introduced in Kudu 1.12?). Even after upgrading Kudu to a newer version, the
old existing tablets still don't have such metadata, so DeletedRowsetGCOp
will not work on those tablets.

I guess [~Koppa] is trying to make these old tablets able to GC rowsets whose
rows are fully deleted, right?



[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-05-30 Thread YifanZhang (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544095#comment-17544095
 ] 

YifanZhang commented on KUDU-3367:
--

Maybe related to KUDU-1625.



[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-05-30 Thread YifanZhang (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543808#comment-17543808
 ] 

YifanZhang commented on KUDU-3367:
--

I'm curious whether setting `tablet_history_max_age_sec` to a small value is
helpful in your case. If so, will DeletedRowsetGCOp be scheduled and the empty
RowSets be deleted in time?



[jira] [Commented] (KUDU-3367) Delta file with full of delete op can not be schedule to compact

2022-05-08 Thread dengke (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533640#comment-17533640
 ] 

dengke commented on KUDU-3367:
--

The following is a description of the specific scenario at that time.

We found a table with a small amount of data where scanning took a long time.

!image-2022-05-09-14-13-16-525.png!

Running perf table_scan while the cluster is idle, we can see that the disk I/O
utilization for the tablet is very high.

!image-2022-05-09-14-16-31-828.png!

Check the scan service threads:

!image-2022-05-09-14-18-05-647.png!

Check the system calls with strace; they are reading files frequently:

!image-2022-05-09-14-19-56-933.png!

Check the layout of the tablet; the average rowset height approaches 1:

!image-2022-05-09-14-21-47-374.png!

Check the size of each stored file:

!image-2022-05-09-14-23-43-973.png!

Sample the data in the rowset files. We found that the sampled data is all in
the deleted state and the "write time" of all the data is 2018-08-01.

!image-2022-05-09-14-26-45-313.png!

We analyzed the data format again and found that there is base data but no
UNDO deltas, yet many REDO deltas full of delete ops.

!image-2022-05-09-14-32-51-573.png!

We check the code:
{code:java}
double DiskRowSet::DeltaStoresCompactionPerfImprovementScore(DeltaCompactionType type) const {
  DCHECK(open_);
  double perf_improv = 0;
  size_t store_count = CountDeltaStores();
  if (store_count == 0) {
    return perf_improv;
  }
  if (type == RowSet::MAJOR_DELTA_COMPACTION) {
    vector<ColumnId> col_ids_with_updates;
    // We get the column ids with updates, but a delete op does not update any column.
    delta_tracker_->GetColumnIdsWithUpdates(&col_ids_with_updates);
    // If we have files but no updates, we don't want to major compact.
    if (!col_ids_with_updates.empty()) {
      // With delete-only deltas this branch is never reached, which means perf_improv stays 0.
      DiskRowSetSpace drss;
      GetDiskRowSetSpaceUsage(&drss);
      double ratio = static_cast<double>(drss.redo_deltas_size) / drss.base_data_size;
      if (ratio >= FLAGS_tablet_delta_store_major_compact_min_ratio) {
        perf_improv = ratio;
      }
    }
  } else if (type == RowSet::MINOR_DELTA_COMPACTION) {
    if (store_count > 1) {
      perf_improv = static_cast<double>(store_count) / FLAGS_tablet_delta_store_minor_compact_max;
    }
  } else {
    LOG_WITH_PREFIX(FATAL) << "Unknown delta compaction type " << type;
  }
  return std::min(1.0, perf_improv);
} {code}
So we can draw the following conclusion: if we get a REDO delta full of
delete ops, which means there is no update op in the file, the current
compaction algorithm will not schedule the file for compaction. If such files
exist and accumulate for a period of time, they will greatly affect our scan
speed.
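
To make the conclusion concrete, here is a trimmed-down, self-contained model of the scoring logic above (simplified signature and made-up numbers, not Kudu's actual API): with zero updated columns the major-compaction branch is skipped entirely, so the score stays 0 no matter how large the delete-only REDO deltas grow.

{code:java}
// Self-contained model of the scoring behaviour discussed above; the numbers
// and the simplified signature are hypothetical.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>

double MajorDeltaCompactionScore(size_t cols_with_updates,
                                 int64_t redo_deltas_size,
                                 int64_t base_data_size,
                                 double min_ratio) {
  double perf_improv = 0;
  // Mirrors the "!col_ids_with_updates.empty()" guard: delete-only deltas
  // update no columns, so this branch never runs for them.
  if (cols_with_updates > 0) {
    double ratio = static_cast<double>(redo_deltas_size) / base_data_size;
    if (ratio >= min_ratio) {
      perf_improv = ratio;
    }
  }
  return std::min(1.0, perf_improv);
}

int main() {
  // 512 MiB of delete-only REDO deltas against 64 MiB of base data,
  // yet the score is still 0 because no column was updated.
  std::cout << MajorDeltaCompactionScore(/*cols_with_updates=*/0,
                                         /*redo_deltas_size=*/512 << 20,
                                         /*base_data_size=*/64 << 20,
                                         /*min_ratio=*/0.1)
            << std::endl;  // prints 0
  return 0;
}
{code}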
 

 

--
This message was sent by Atlassian Jira
(v8.20.7#820007)