[ https://issues.apache.org/jira/browse/KUDU-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012887#comment-16012887 ]
Todd Lipcon commented on KUDU-2014:
-----------------------------------

Another potentially easy win is to actually start more than one
metadata-loading thread per disk. This seems to improve startup time by ~30%:

{code}
[root@vd0340 data]# echo 3 | sudo tee /proc/sys/vm/drop_caches
3
[root@vd0340 data]# time ls *metadata | xargs -P20 -n1 cat > /dev/null

real    0m29.313s
user    0m0.124s
sys     0m0.607s
[root@vd0340 data]# echo 3 | sudo tee /proc/sys/vm/drop_caches
3
[root@vd0340 data]# time ls *metadata | xargs -n1 cat > /dev/null

real    0m42.676s
user    0m0.273s
sys     0m1.153s
{code}

(I guess the improvement is just getting more queue depth to the underlying
disk.)

> Explore additional approaches to improve LBM startup time
> ----------------------------------------------------------
>
>                 Key: KUDU-2014
>                 URL: https://issues.apache.org/jira/browse/KUDU-2014
>             Project: Kudu
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 1.4.0
>            Reporter: Adar Dembo
>              Labels: data-scalability
>
> The fix for KUDU-1549 added support for deleting full log block manager
> containers with no live blocks, and for compacting container metadata to
> omit CREATE/DELETE record pairs. Both of these will help reduce the amount
> of metadata that must be read at startup. However, there's more we can do
> to help; this JIRA captures some additional ideas worth exploring (if/when
> LBM startup once again becomes intolerable):
> In [this
> gerrit|https://gerrit.cloudera.org/#/c/6826/2/src/kudu/fs/log_block_manager.cc@90],
> Todd made the case that container metadata processing is seek-dominated:
> {quote}
> Looking at a data/ dir on a cluster that has been around for quite some
> time, most of the metadata files seem to be around 400KB. Assuming
> 100MB/sec sequential throughput and a 10ms seek, it definitely seems like
> startup time would be seek-dominated (10 or 20ms of seeking, depending on
> whether various internal metadata pages are hot in cache, plus only 4ms of
> sequential read time).
> {quote}
> We theorized several ways to reduce seeking, all focused on reducing the
> number of discrete container metadata files read at startup:
> # Raise the container max data file size. This won't help on older
> versions of el6 with ext4, but will help everywhere else. It makes sense
> for the max data file size to be a function of the disk size anyway, and
> it's a pretty cheap way to extract more scalability. (A rough sizing
> sketch follows at the end of this message.)
> # Reuse container data file holes, explicitly to avoid creating so many
> containers, perhaps with a round of "defragmentation" to simplify reuse,
> or perhaps not. As a side effect, metadata file compaction becomes more
> important (and more costly).
> # Eschew one metadata file per data file altogether and maintain just one
> metadata file. Deleting "dead" containers would no longer improve metadata
> startup cost, and metadata compaction would be a lot more expensive. Block
> records themselves would be larger, because each record would now need to
> point to a particular data file, though this can be mitigated in various
> ways. A variant of this would be to do away with the 1-1 relationship
> between metadata and data files and make it m-n instead.
> # Reduce the number of extents in container metadata files via judicious
> preallocation.
> See the gerrit linked above for more details.
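
Riffing on the comment above: below is a minimal sketch, not Kudu's actual
implementation, of what starting several metadata-loading threads per data
directory could look like. The *.metadata extension matching and the
ParseContainerMetadata() helper are hypothetical stand-ins for the real
container-loading logic; the point is only that a handful of concurrent
readers keeps more I/O requests queued against the underlying disk.

{code}
// Sketch only (assumed helpers, not Kudu's real code): read every container
// metadata file under one data directory with a small pool of reader
// threads, so the disk sees more than one outstanding request at a time.
#include <filesystem>
#include <fstream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for replaying the CREATE/DELETE block records.
void ParseContainerMetadata(const std::string& contents) {
  // ... build the in-memory block map from the records ...
}

void LoadAllContainerMetadata(const fs::path& data_dir, int threads_per_disk) {
  // Collect the *.metadata files up front.
  std::vector<fs::path> files;
  for (const auto& entry : fs::directory_iterator(data_dir)) {
    if (entry.path().extension() == ".metadata") {
      files.push_back(entry.path());
    }
  }

  // Hand the files out to a fixed number of reader threads.
  std::mutex mu;
  size_t next = 0;
  std::vector<std::thread> readers;
  for (int i = 0; i < threads_per_disk; ++i) {
    readers.emplace_back([&] {
      while (true) {
        size_t idx;
        {
          std::lock_guard<std::mutex> l(mu);
          if (next == files.size()) return;
          idx = next++;
        }
        std::ifstream in(files[idx], std::ios::binary);
        std::ostringstream buf;
        buf << in.rdbuf();
        ParseContainerMetadata(buf.str());
      }
    });
  }
  for (auto& t : readers) t.join();
}
{code}

The right threads_per_disk presumably depends on the device and on how many
containers each data dir holds; it would want to be a flag rather than a
hard-coded constant.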
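
And for idea #1 in the quoted description, the rough sizing sketch mentioned
there: a back-of-the-envelope way to make the container max data file size a
function of the disk size. The policy, the 1% target, and the 10-100 GiB
bounds are all placeholder assumptions, not an existing Kudu flag or default.

{code}
// Hypothetical sizing policy: cap container data files at roughly 1% of the
// disk backing the data directory, clamped to a sane range, so that larger
// disks end up with fewer, larger containers (and fewer metadata files).
#include <algorithm>
#include <cstdint>

int64_t SuggestedContainerMaxSize(int64_t disk_capacity_bytes) {
  constexpr int64_t kFloorBytes   = 10LL * 1024 * 1024 * 1024;   // 10 GiB
  constexpr int64_t kCeilingBytes = 100LL * 1024 * 1024 * 1024;  // 100 GiB
  // Target on the order of ~100 containers per disk when the disk is full.
  return std::clamp(disk_capacity_bytes / 100, kFloorBytes, kCeilingBytes);
}
{code}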