[
https://issues.apache.org/jira/browse/KUDU-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281135#comment-15281135
]
Todd Lipcon commented on KUDU-1451:
-----------------------------------
Hmm, this seems to be due to KUDU-941, which made tablet deletion lazy (on the
next start) rather than right when the table is deleted, in order to avoid
spurious remote bootstraps getting triggered while a table was being deleted.
However, it seems like whatever we attempted to do with KUDU-941 didn't really
work, because we reported it again as KUDU-1337, and fixed it a different way
(by having the tserver cache the ids of recently deleted tablets). So, I'm
thinking maybe we should just be reverting KUDU-941? [~mpercy] any thoughts
here? It seems you did the patches for both of those previous issues, so maybe
you remember why KUDU-941 wasn't sufficient, etc.
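
For readers unfamiliar with the KUDU-1337 approach mentioned above, here is a minimal sketch of what "cache the ids of recently deleted tablets" could look like. It is not Kudu's actual code (the tserver is written in C++). The class name, the TTL and size limits, and the should_accept_remote_bootstrap helper are all hypothetical and chosen only for illustration.

```python
import time
from collections import OrderedDict


class RecentlyDeletedTablets:
    """Hypothetical bounded, time-limited record of deleted tablet ids."""

    def __init__(self, ttl_seconds=300.0, max_entries=10000, clock=time.monotonic):
        self._ttl = ttl_seconds          # assumed retention window
        self._max = max_entries          # assumed size cap
        self._clock = clock              # injectable for testing
        self._deleted_at = OrderedDict() # tablet_id -> time of deletion

    def mark_deleted(self, tablet_id):
        # Re-insert so the most recent deletion is always at the end.
        self._deleted_at.pop(tablet_id, None)
        self._deleted_at[tablet_id] = self._clock()
        # Evict the oldest entries once the cap is exceeded.
        while len(self._deleted_at) > self._max:
            self._deleted_at.popitem(last=False)

    def was_recently_deleted(self, tablet_id):
        deleted_at = self._deleted_at.get(tablet_id)
        if deleted_at is None:
            return False
        if self._clock() - deleted_at > self._ttl:
            # Entry has expired; forget it.
            del self._deleted_at[tablet_id]
            return False
        return True


def should_accept_remote_bootstrap(cache, tablet_id):
    """Refuse to re-create a tablet replica that was just deleted."""
    return not cache.was_recently_deleted(tablet_id)
```

With a check like this in place, the tablet data itself could be deleted eagerly, since a stray remote bootstrap request for a just-deleted tablet would be rejected instead of re-creating the replica.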
> Restarting a TS that had a lot of deleted tablets takes tens of minutes
> -----------------------------------------------------------------------
>
> Key: KUDU-1451
> URL: https://issues.apache.org/jira/browse/KUDU-1451
> Project: Kudu
> Issue Type: Bug
> Components: tablet
> Affects Versions: 0.8.0
> Reporter: Jean-Daniel Cryans
> Priority: Critical
>
> Running a workload that deletes and creates a new table with hundreds of
> tablets every few hours, I encounter this issue where if I haven't restarted
> the cluster in a while it'll take tens of minutes to process all the deleted
> tablets. The logs for a single tablet look like:
> {noformat}
> I0511 19:03:52.380903 79512 ts_tablet_manager.cc:609] Loading metadata for tablet 28f7ae54ac1d413b8a0e694e1dcef0fc
> I0511 19:03:52.391885 79512 ts_tablet_manager.cc:937] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet Manager startup: Rolling forward tablet deletion of type TABLET_DATA_DELETED
> I0511 19:03:52.391894 79512 ts_tablet_manager.cc:964] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet data with delete state TABLET_DATA_DELETED
> I0511 19:03:52.497952 79512 ts_tablet_manager.cc:974] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet deleted. Last logged OpId: 406.65248
> I0511 19:03:52.498010 79512 ts_tablet_manager.cc:946] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet superblock
> {noformat}
> In my latest instance of this problem, running on
> 43c9c87604f3b6f3dd286c63344bf18a2db08c21, it took almost 20 minutes to
> process 18k deleted tablets; only then could the TS start bootstrapping.
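
As a rough sanity check on the numbers in the description, the sketch below compares the per-tablet time seen in the log excerpt with the reported total. It assumes the tablets are processed one after another at startup (the single thread id 79512 in the excerpt suggests this, but it is an assumption) and that the one tablet shown is representative. "Almost 20 minutes" is taken as 20 minutes and "18k" as 18,000.

```python
from datetime import datetime


def parse_log_time(text):
    """Parse the HH:MM:SS.uuuuuu part of a glog-style log line."""
    return datetime.strptime(text, "%H:%M:%S.%f")


# First and last timestamps for the single tablet in the log excerpt.
start = parse_log_time("19:03:52.380903")  # "Loading metadata for tablet ..."
end = parse_log_time("19:03:52.498010")    # "Deleting tablet superblock"

per_tablet_s = (end - start).total_seconds()

tablets = 18_000        # "18k deleted tablets"
observed_total_s = 20 * 60  # "almost 20 minutes", rounded up

# Serial estimate if every tablet cost as much as the sampled one.
serial_estimate_min = per_tablet_s * tablets / 60

# Average per-tablet cost implied by the reported total.
observed_per_tablet_ms = observed_total_s / tablets * 1000

print(f"sampled tablet: {per_tablet_s * 1000:.1f} ms")
print(f"serial estimate for {tablets} tablets: {serial_estimate_min:.1f} min")
print(f"implied average per tablet: {observed_per_tablet_ms:.1f} ms")
```

The sampled tablet takes roughly 117 ms, most of it between "Deleting tablet data" and "Tablet deleted", while the reported total implies an average of about 67 ms per tablet. Both figures are in the same range, which is consistent with the startup time being dominated by rolling forward deletions one tablet at a time.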