[ https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661571#comment-14661571 ]
chenxu commented on HBASE-3308:
-------------------------------

While reading the patch, I noticed this line:

    int nbFiles = hstoreFilesToSplit.size();

hstoreFilesToSplit is a Map, not a List, so size() returns the number of map entries rather than the number of StoreFiles. If the intent is to count all the StoreFiles, this code may not be right.

> SplitTransaction.splitStoreFiles slows splits a lot
> ---------------------------------------------------
>
>                 Key: HBASE-3308
>                 URL: https://issues.apache.org/jira/browse/HBASE-3308
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.0, 0.92.0
>
>         Attachments: HBASE-3308-0.89.patch, HBASE-3308.patch
>
>
> Recently I've been seeing some slow splits in our production environment
> triggering timeouts, so I decided to take a closer look into the issue.
> According to my debugging, we spend almost all the time it takes to split on
> creating the reference files. Each file in my testing takes at least 300ms to
> create, and averages around 600ms. Since we create two references per store
> file, it means that a region with 4 store files can easily take up to 5
> seconds to split just to create those references.
> An intuitive improvement would be to create those files in parallel, so at
> least it wouldn't be much slower when we're splitting a higher number of
> files. Stack left the following comment in the code:
> {noformat}
>         // TODO: If the below were multithreaded would we complete steps in less
>         // elapsed time?  St.Ack 20100920
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
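
To make the concern in the comment concrete, here is a minimal, self-contained Java sketch. It is not the code from the attached patches. It assumes the map is keyed by column family with a list of store files per family, and it uses plain strings in place of HBase's StoreFile type. The names SplitSketch, countStoreFiles, createReference, and splitStoreFiles are hypothetical. The sketch counts the store files across all map values, then creates the two references per store file in parallel, as the issue description proposes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SplitSketch {

    // Hypothetical stand-in for writing one reference file; returns its name.
    static String createReference(String storeFile, boolean top) {
        return storeFile + (top ? ".top" : ".bottom");
    }

    // Counts store files across all families.
    // filesByFamily.size() would only return the number of families.
    static int countStoreFiles(Map<String, List<String>> filesByFamily) {
        int n = 0;
        for (List<String> files : filesByFamily.values()) {
            n += files.size();
        }
        return n;
    }

    // Creates both references for every store file on a thread pool and
    // returns the reference names in submission order.
    static List<String> splitStoreFiles(Map<String, List<String>> filesByFamily) {
        int nbFiles = countStoreFiles(filesByFamily);
        List<String> refs = new ArrayList<>();
        if (nbFiles == 0) {
            return refs; // newFixedThreadPool rejects a size of 0
        }
        // Assumption: one thread per store file; a real pool would likely be capped.
        ExecutorService pool = Executors.newFixedThreadPool(nbFiles);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (List<String> files : filesByFamily.values()) {
                for (String f : files) {
                    Callable<String> top = () -> createReference(f, true);
                    Callable<String> bottom = () -> createReference(f, false);
                    futures.add(pool.submit(top));
                    futures.add(pool.submit(bottom));
                }
            }
            for (Future<String> fut : futures) {
                refs.add(fut.get()); // blocks until that reference is done
            }
            return refs;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("reference creation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With two families holding two and one store files respectively, the map's size() is 2 while countStoreFiles returns 3, so a pool sized from size() would be smaller than intended.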