weizuo93 opened a new issue #4997:
URL: https://github.com/apache/incubator-doris/issues/4997
Rows removed by a `delete operation` are not physically deleted from disk
until base compaction is performed on the relevant tablet. Logically deleted
data not only occupies disk space but also degrades scan performance, so it is
necessary to run a compaction task on tablets that contain a lot of deleted
rows.
Can we take 'rows_del_filtered' into consideration when selecting a tablet
for compaction task?
For each tablet, we can record the number of rows filtered out by delete
conditions during scans since the last base compaction, and use that count as
a factor when selecting a tablet for a compaction task. The `tablet score` for
compaction could be calculated like this:
`tablet_score = k1 * tablet_scan_frequency + k2 * old_compaction_score +
k3 * rows_del_filtered`
`k1`, `k2` and `k3` can be set dynamically through the HTTP interface
`/api/update_config`.
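
To make the proposal concrete, here is a minimal Python sketch of the weighted score and of picking the highest-scoring tablet. It is only an illustration of the formula above, not Doris code: the `TabletStats` class, its field names, and the `pick_tablet` helper are hypothetical, and the weights are passed in as plain arguments rather than read from the BE config.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TabletStats:
    """Hypothetical per-tablet counters; not an actual Doris structure."""
    tablet_id: int
    scan_frequency: float        # how often the tablet is scanned
    old_compaction_score: float  # the existing compaction score
    rows_del_filtered: int       # rows filtered by delete conditions since last base compaction


def tablet_score(t: TabletStats, k1: float, k2: float, k3: float) -> float:
    # tablet_score = k1 * tablet_scan_frequency + k2 * old_compaction_score
    #                + k3 * rows_del_filtered
    return (k1 * t.scan_frequency
            + k2 * t.old_compaction_score
            + k3 * t.rows_del_filtered)


def pick_tablet(tablets: Iterable[TabletStats],
                k1: float, k2: float, k3: float) -> Optional[TabletStats]:
    # Return the tablet with the highest score, or None if there are no tablets.
    return max(tablets, key=lambda t: tablet_score(t, k1, k2, k3), default=None)
```

With `k3 = 0` the selection depends only on scan frequency and the existing compaction score; raising `k3` lets a tablet with many delete-filtered rows win even when it is scanned less often.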
Of course, rows in `DEL_PARTIAL_SATISFIED` blocks and rows in `DEL_SATISFIED`
blocks affect scan performance differently, so the two can be treated
separately.
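
One possible way to treat the two kinds of blocks separately is to keep two counters and give each its own weight, so the single `k3 * rows_del_filtered` term becomes a sum of two terms. The sketch below is in Python for illustration only; the function name, the counter names, and the weights `k3_partial` and `k3_satisfied` are all hypothetical and not part of Doris.

```python
def deleted_rows_term(rows_partial: int, rows_satisfied: int,
                      k3_partial: float, k3_satisfied: float) -> float:
    """Weighted contribution of delete-filtered rows to the tablet score.

    rows_partial   -- filtered rows that came from DEL_PARTIAL_SATISFIED blocks
    rows_satisfied -- filtered rows that came from DEL_SATISFIED blocks
    The two weights are assumptions; how they should be chosen is left open.
    """
    return k3_partial * rows_partial + k3_satisfied * rows_satisfied
```

The result would replace `k3 * rows_del_filtered` in the score, for example `deleted_rows_term(100, 1000, 1.0, 0.25)` contributes `350.0`.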