[ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-1662:
--------------------------
Status: Patch Available (was: In Progress)
Updated and reran the local build; everything passes. Resubmitting v3 of the patch.
> [hbase] Make region splits faster
> ---------------------------------
>
> Key: HADOOP-1662
> URL: https://issues.apache.org/jira/browse/HADOOP-1662
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Reporter: stack
> Assignee: stack
> Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch,
> splits-v3.patch
>
>
> HADOOP-1644 '[hbase] Compactions should take no longer than period between
> memcache flushes' is about making compactions run faster. This issue is
> about making splits faster. Currently a split reads a map file as input
> and, for each record, writes to one of two new mapfiles. This is too
> slow: about 30 seconds to split 120MB. The Bigtable paper hints that
> splitting is very fast there because the split children feed off the split
> parent. In primitive testing, splitting mapfiles using raw streams ran 3 to
> 4 times faster than splitting on mapfile keys.
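
For illustration only, here is a minimal sketch contrasting the two split styles described above. It is not the code in the attached patches and does not use the Hadoop MapFile API; an in-memory sorted map stands in for a mapfile, and the names SplitSketch, Children, copyingSplit, and referenceSplit are hypothetical. The copying split rewrites every record into one of two new maps, while the reference split hands each child a view over the parent's key range, which is roughly what "children feed off the split parent" suggests.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class SplitSketch {
    /** Hypothetical holder for the two halves produced by a split. */
    static final class Children {
        final SortedMap<String, String> lower;
        final SortedMap<String, String> upper;

        Children(SortedMap<String, String> lower, SortedMap<String, String> upper) {
            this.lower = lower;
            this.upper = upper;
        }
    }

    /** Copying split: read every parent record and write it to one of two new maps. */
    static Children copyingSplit(SortedMap<String, String> parent, String midKey) {
        SortedMap<String, String> lower = new TreeMap<>();
        SortedMap<String, String> upper = new TreeMap<>();
        for (Map.Entry<String, String> e : parent.entrySet()) {
            if (e.getKey().compareTo(midKey) < 0) {
                lower.put(e.getKey(), e.getValue());
            } else {
                upper.put(e.getKey(), e.getValue());
            }
        }
        return new Children(lower, upper);
    }

    /** Reference split: no records are copied; each child is a view backed by the parent. */
    static Children referenceSplit(SortedMap<String, String> parent, String midKey) {
        // headMap excludes midKey, tailMap includes it, so the halves do not overlap.
        return new Children(parent.headMap(midKey), parent.tailMap(midKey));
    }

    public static void main(String[] args) {
        SortedMap<String, String> parent = new TreeMap<>();
        for (char c = 'a'; c <= 'f'; c++) {
            parent.put("row-" + c, "value-" + c);
        }
        Children copied = copyingSplit(parent, "row-d");
        Children referenced = referenceSplit(parent, "row-d");
        // Both styles expose the same records; only the amount of work differs.
        System.out.println(copied.lower.equals(referenced.lower));
        System.out.println(copied.upper.equals(referenced.upper));
    }
}
```

The reference split does constant work at split time regardless of the parent's size, at the cost of keeping the parent around until the children have been rewritten by some later step.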