[jira] Updated: (HADOOP-1662) [hbase] Make region splits faster

stack (JIRA) Tue, 07 Aug 2007 19:40:25 -0700

     [ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


stack updated HADOOP-1662:
--------------------------

    Attachment: splits-v3.patch

version 3

Improvements around recovery from catastrophic loss of the ROOT and
META regions (needed to make TestRegionServerAbort work reliably
once this splits patch is applied).

M src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Added logging of abort, close and wait.  Also, abort/close
    was doing a remove, which left the subsequent wait with nothing
    to wait on.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    Debug logging around split and edits.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Added toString to each of the PendingOperation implementations.
    In the ShutdownPendingOperation scan of meta data, removed
    check of startcode (if the server name is that of the dead
    server, it needs reassigning even if start code is good).
    Also, if server name is null -- possible if we are missing
    edits off end of log -- then the region should be reassigned
    just in case it's from the dead server.  Also, if reassigning,
    clear from pendingRegions.  Server may have died after sending
    region is up but before the server confirms receipt in the
    meta scan. Added more detail to each log.  In OpenPendingOperation
    we were trying to clear pendingRegion in wrong place -- it was
    never executed (regions were always pending).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java
    (intToBytes, longToBytes, getBytes, bytesToString, bytesToLong): Added.
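
The reassignment rule described for the ShutdownPendingOperation meta
scan can be sketched roughly as below.  This is an illustration only,
not the code in the patch: the class ReassignSketch, both method names
and the use of plain strings for server and region names are
assumptions made for the example.

```java
import java.util.Set;

/** Hypothetical sketch of the reassignment rule; not the actual HMaster code. */
final class ReassignSketch {

    /**
     * Decide whether a region row seen in the meta scan needs reassigning
     * after a server has died.  The start code is deliberately not consulted.
     *
     * @param rowServerName  server name recorded in the meta row, or null if
     *                       the edit carrying it was lost off the end of the log
     * @param deadServerName name of the server that died
     */
    static boolean needsReassign(String rowServerName, String deadServerName) {
        if (rowServerName == null) {
            // Owner unknown: reassign just in case the region was on the dead server.
            return true;
        }
        // Same server name as the dead server: reassign even if the start code looks good.
        return rowServerName.equals(deadServerName);
    }

    /**
     * Apply the rule to one row.  When the region is to be reassigned it is
     * also cleared from the (assumed) set of pending regions.
     *
     * @return true if the region should be reassigned
     */
    static boolean processRow(String regionName, String rowServerName,
                              String deadServerName, Set<String> pendingRegions) {
        if (!needsReassign(rowServerName, deadServerName)) {
            return false;
        }
        pendingRegions.remove(regionName);
        return true;
    }
}
```

Clearing the pending entry at the same point as the reassignment
decision covers the case where the server died after reporting the
region as up but before that was confirmed in the meta scan.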

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch, 
> splits-v3.patch
>
>
> HADOOP-1644 '[hbase] Compactions should take no longer than period between 
> memcache flushes' is about making compactions run faster.  This issue is 
> about making splits faster.  Currently splits are done by reading as input a 
> map file and per record, writing out two new mapfiles.  It's currently too 
> slow.  ~30 seconds to split 120MB. Google hints in bigtable that splitting is 
> very fast because they let the split children feed off the split parent.  
> Primitive testing has splitting mapfiles using raw streams running 3 to 4 
> times faster than splitting on mapfile keys.
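
For illustration, the per-record split described above can be modelled
in memory as below.  This is a sketch under stated assumptions, not the
HBase implementation: a sorted map stands in for a map file, and the
class SplitSketch, the method splitPerRecord and the midKey parameter
are hypothetical names introduced for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical in-memory model of a per-record split; not the HBase code. */
final class SplitSketch {

    /**
     * Read every record of the parent and write it to one of two children:
     * keys below midKey go to the first child, all other keys to the second.
     * Each record is read and rewritten once, which is why this approach
     * scales with the number of records in the parent.
     */
    static List<SortedMap<String, String>> splitPerRecord(
            SortedMap<String, String> parent, String midKey) {
        SortedMap<String, String> lower = new TreeMap<>();
        SortedMap<String, String> upper = new TreeMap<>();
        for (Map.Entry<String, String> record : parent.entrySet()) {
            if (record.getKey().compareTo(midKey) < 0) {
                lower.put(record.getKey(), record.getValue());
            } else {
                upper.put(record.getKey(), record.getValue());
            }
        }
        return Arrays.asList(lower, upper);
    }
}
```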

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
