[ 
https://issues.apache.org/jira/browse/KUDU-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281135#comment-15281135
 ] 

Todd Lipcon commented on KUDU-1451:
-----------------------------------

Hmm, this seems to be due to KUDU-941, which made tablet deletion lazy
(deferred to the next startup) rather than immediate when the table is
deleted, in order to avoid spurious remote bootstraps being triggered while a
table was being deleted. However, it seems the KUDU-941 approach didn't really
work, because we reported the same problem again as KUDU-1337 and fixed it a
different way (by having the tserver cache the ids of recently deleted
tablets). So I'm thinking maybe we should just revert KUDU-941? [~mpercy], any
thoughts here? It seems you did the patches for both of those previous issues,
so maybe you remember why KUDU-941 wasn't sufficient.
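
For illustration only, the KUDU-1337 idea mentioned above (the tserver
remembering the ids of recently deleted tablets) could be sketched roughly as
follows. This is a hypothetical Python sketch, not Kudu's actual C++ code: the
class name, the method names, the capacity bound, and the assumption that the
cache is consulted before accepting a remote bootstrap request are all made up
for the example.

```python
from collections import OrderedDict


class RecentlyDeletedTablets:
    """Bounded set of recently deleted tablet ids (hypothetical sketch)."""

    def __init__(self, capacity=1000):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._ids = OrderedDict()  # insertion order tracks deletion order

    def mark_deleted(self, tablet_id):
        # Re-inserting an id moves it to the "most recent" end.
        self._ids.pop(tablet_id, None)
        self._ids[tablet_id] = True
        # Evict the oldest entries once the bound is exceeded.
        while len(self._ids) > self._capacity:
            self._ids.popitem(last=False)

    def was_recently_deleted(self, tablet_id):
        return tablet_id in self._ids


def should_accept_remote_bootstrap(cache, tablet_id):
    # Assumption: a request to copy a replica of a tablet that was just
    # deleted on this server is treated as spurious and refused.
    return not cache.was_recently_deleted(tablet_id)
```

With a cache like this, deletion can happen right away and a late bootstrap
request for the same tablet id is simply rejected, so nothing needs to be
deferred to the next startup.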

> Restarting a TS that had a lot of deleted tablets takes tens of minutes
> -----------------------------------------------------------------------
>
>                 Key: KUDU-1451
>                 URL: https://issues.apache.org/jira/browse/KUDU-1451
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet
>    Affects Versions: 0.8.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>
> Running a workload that deletes and creates a new table with hundreds of 
> tablets every few hours, I encounter this issue where if I haven't restarted 
> the cluster in a while it'll take tens of minutes to process all the deleted 
> tablets. The logs for a single tablet look like:
> {noformat}
> I0511 19:03:52.380903 79512 ts_tablet_manager.cc:609] Loading metadata for tablet 28f7ae54ac1d413b8a0e694e1dcef0fc
> I0511 19:03:52.391885 79512 ts_tablet_manager.cc:937] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet Manager startup: Rolling forward tablet deletion of type TABLET_DATA_DELETED
> I0511 19:03:52.391894 79512 ts_tablet_manager.cc:964] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet data with delete state TABLET_DATA_DELETED
> I0511 19:03:52.497952 79512 ts_tablet_manager.cc:974] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet deleted. Last logged OpId: 406.65248
> I0511 19:03:52.498010 79512 ts_tablet_manager.cc:946] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet superblock
> {noformat}
> In my latest instance of this problem, running on 
> 43c9c87604f3b6f3dd286c63344bf18a2db08c21, it took almost 20 minutes to 
> process 18k deleted tablets... then the TS can start bootstrapping.
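
As a rough sanity check on the numbers in the report, the per-tablet cost can
be worked out from the log timestamps and from the totals quoted above. The
sketch below is only back-of-the-envelope arithmetic: it treats "almost 20
minutes" as 20 minutes and "18k" as 18,000, both of which are rounded figures,
and the variable names are illustrative.

```python
from datetime import datetime


def parse_ts(ts):
    # Time-of-day portion of a glog timestamp, e.g. "19:03:52.380903".
    return datetime.strptime(ts, "%H:%M:%S.%f")


# Timestamps copied from the single-tablet log excerpt in the report.
load_metadata = parse_ts("19:03:52.380903")
delete_begin = parse_ts("19:03:52.391894")
delete_done = parse_ts("19:03:52.497952")
superblock_begin = parse_ts("19:03:52.498010")

# Time spent between "Deleting tablet data" and "Tablet deleted".
delete_ms = (delete_done - delete_begin).total_seconds() * 1000
# Time from loading metadata to starting the superblock deletion.
span_ms = (superblock_begin - load_metadata).total_seconds() * 1000

# Average per-tablet cost implied by the totals (rounded inputs).
total_seconds = 20 * 60
tablet_count = 18_000
avg_ms = total_seconds * 1000 / tablet_count

print(f"data deletion in sample: {delete_ms:.1f} ms")
print(f"logged span in sample:   {span_ms:.1f} ms")
print(f"average over the run:    {avg_ms:.1f} ms per tablet")
```

In the sample, nearly all of the logged span is spent in the data deletion
step, and the implied average over the whole run is on the order of tens of
milliseconds per tablet, which adds up quickly when the deletions are all
processed at startup before bootstrapping begins.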



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
