[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs

stack (JIRA) Sat, 16 Apr 2011 13:46:48 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020657#comment-13020657
 ]


stack commented on HBASE-1364:
------------------------------

Latest patch looks good but still fails TestDistributedLogSplitting it seems.  
I'm on apache hbase trunk and apply your r6 patch from review board.  I get:

{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.master.TestDistributedLogSplitting
-------------------------------------------------------------------------------
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 164.652 sec <<< 
FAILURE!
testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)  
Time elapsed: 48.77 sec  <<< FAILURE!
java.lang.AssertionError: region server completed the split before aborting
        at org.junit.Assert.fail(Assert.java:91)
        at 
org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testWorkerAbort(TestDistributedLogSplitting.java:291)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
        at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
        at 
org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
        at 
org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
        at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at 
org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
        at 
org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
{code}

I'll attach the log.

If you want me to dig in too Prakash, just say and I'll have a go.  Otherwise 
defer to you since you know best whats in motion here.

Meantime I'm going to try it out on a cluster.  Good stuff Prakash!

> [performance] Distributed splitting of regionserver commit logs
> ---------------------------------------------------------------
>
>                 Key: HBASE-1364
>                 URL: https://issues.apache.org/jira/browse/HBASE-1364
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: stack
>            Assignee: Prakash Khemani
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 1364-v5.txt, HBASE-1364.patch, 
> org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt
>
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> HBASE-1008 has some improvements to our log splitting on regionserver crash; 
> but it needs to run even faster.
> (Below is from HBASE-1008)
> In bigtable paper, the split is distributed. If we're going to have 1000 
> logs, we need to distribute or at least multithread the splitting.
> 1. As is, regions starting up expect to find one reconstruction log only. 
> Need to make it so pick up a bunch of edit logs and it should be fine that 
> logs are elsewhere in hdfs in an output directory written by all split 
> participants whether multithreaded or a mapreduce-like distributed process 
> (Lets write our distributed sort first as a MR so we learn whats involved; 
> distributed sort, as much as possible should use MR framework pieces). On 
> startup, regions go to this directory and pick up the files written by split 
> participants deleting and clearing the dir when all have been read in. Making 
> it so can take multiple logs for input, can also make the split process more 
> robust rather than current tenuous process which loses all edits if it 
> doesn't make it to the end without error.
> 2. Each column family rereads the reconstruction log to find its edits. Need 
> to fix that. Split can sort the edits by column family so store only reads 
> its edits.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs

Reply via email to