[jira] [Commented] (KUDU-2131) Tablet copy session may expire before completion for large tablet

Adar Dembo (JIRA) Sat, 02 Sep 2017 12:16:37 -0700

    [ https://issues.apache.org/jira/browse/KUDU-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151567#comment-16151567 ]


Adar Dembo commented on KUDU-2131:
----------------------------------

Interesting. So although the FIFO container retrieval serves to "spread the 
load" across all available containers, that actually hurts us by increasing the 
total number of containers (and thus files) that need to be fsynced. The change 
you suggested seems like an easy fix; I don't see any reason why FIFO would be 
a _good_ thing here.
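
To make the FIFO vs. LIFO difference concrete, here is a toy model of the 
effect. It is only a sketch, not Kudu's actual block manager code: the pool, 
the function name, and the numbers are made up for illustration, writes are 
assumed to be sequential, and containers are assumed never to fill up.

```python
from collections import deque


def containers_touched(num_containers, num_blocks, fifo):
    """Count distinct containers written to in a toy container pool.

    Each block write takes one available container, appends to it, and
    returns the container to the back of the pool. Hypothetical model:
    containers never become full and there is a single writer.
    """
    available = deque(range(num_containers))
    dirty = set()
    for _ in range(num_blocks):
        # FIFO takes the least recently returned container (front of the
        # pool); LIFO takes the most recently returned one (back of the pool).
        container = available.popleft() if fifo else available.pop()
        dirty.add(container)
        available.append(container)
    return len(dirty)


fifo_dirty = containers_touched(num_containers=100, num_blocks=1000, fifo=True)
lifo_dirty = containers_touched(num_containers=100, num_blocks=1000, fifo=False)
print(fifo_dirty, lifo_dirty)  # prints "100 1"
```

In this model FIFO cycles through every available container, so every one of 
them ends up dirty and needs an fsync, while LIFO keeps reusing the same 
container until it would fill up.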

Just to be clear, your first test didn't trigger the issue, right? On a fresh 
tserver doing a copy I'd expect one non-full container per disk at a time. But 
I can definitely see how test #2 would trigger it.


> Tablet copy session may expire before completion for large tablet
> -----------------------------------------------------------------
>
>                 Key: KUDU-2131
>                 URL: https://issues.apache.org/jira/browse/KUDU-2131
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Hao Hao
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> KUDU-1726 introduced an optimization to do a bulk sync-to-disk once the 
> tablet copy operation is complete. However, when testing in a large cluster, 
> I found that syncing to disk in batches can cause the tablet copy session to 
> expire before the synchronization completes. There is a flag, 
> '--tablet_copy_idle_timeout_ms', that controls the amount of time without 
> activity before a tablet copy session expires, but it is tagged as hidden 
> (not user-facing).
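
The expiry described above can be sketched as a toy idle-timeout model. This 
is an illustration under assumptions, not Kudu's session code: the class, its 
methods, and all the timings are hypothetical, and the only behavior taken 
from the description is that a session expires after a period without 
activity.

```python
class CopySession:
    """Toy session that expires after a period with no activity."""

    def __init__(self, idle_timeout_ms, now_ms=0):
        self.idle_timeout_ms = idle_timeout_ms
        self.last_activity_ms = now_ms

    def touch(self, now_ms):
        # Any request on the session counts as activity.
        self.last_activity_ms = now_ms

    def is_expired(self, now_ms):
        return now_ms - self.last_activity_ms > self.idle_timeout_ms


session = CopySession(idle_timeout_ms=1000)
now = 0

# Fetch phase: each (hypothetical) 200 ms fetch refreshes the session.
for _ in range(5):
    now += 200
    session.touch(now)
expired_during_fetch = session.is_expired(now)

# Bulk sync phase: assumed to take 5000 ms with no session activity.
now += 5000
expired_after_sync = session.is_expired(now)

print(expired_during_fetch, expired_after_sync)  # prints "False True"
```

In this model the session survives the fetch phase because every fetch 
refreshes it, but a single long sync phase with no activity outlasts the idle 
timeout.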



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
