-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/179/#review216
-----------------------------------------------------------


Regarding rewriting recovered edits back to the WAL, I'm afraid we may have an 
issue with the following situation:
- RS A writes edits 1-100 to log
- RS A fails
- RS B starts recovering log, rewrites edits 1-50
- RS B fails while recovering
- RS C starts up. It *should* replay 1-100. Instead I fear it might only replay 
1-50, or replay 1-100 followed by 1-50, either of which would be incorrect.

We should add a test case where the RS fails in the middle of recovering a 
prior failure.


Also, replayRecoveredEdits returns the highest seqid from the logs. Is it 
possible for storeFiles to have a higher seqid than what's in the log in any 
situation? We should add some kind of assert perhaps.


src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment973>

    this returns -1 if there is nothing to replay. Shouldn't it return the 
maxSeqId + 1 perhaps?
    
    Do we have a test which makes a region, writes some data, closes the 
region, opens the region, writes some data, then recovers logs?



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment972>

    we used to print the sequence ID after the increment, now we print before 
the increment. Maybe change log to "sequenceid for next edit will be NNN"



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment975>

    add some javadoc - I think important to note that minSeqId is *exclusive* - 
it's a little unintuitive. Perhaps makes more sense to pass in 
maxSeqIdFromStoreFiles+1 here, and make minSeqId be "minimum sequence id, 
inclusive, to apply"



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment976>

    should probably use LOG.error for these



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment974>

    I agree, i don't think the splitter would ever write a partial one. We 
should treat an IOE here as a real error



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment977>

    when would this happen? deserves a comment I think



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment978>

    can we rename maxSeqIdInLog to currentEditSeqId or something? since that's 
what we're really talking about here. Also reversing this inequality to: if 
(currentEditSeqId < storeMaxSeqId>) I think it would read more clearly



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment979>

    I think this will crash when the region contains stores that were the 
result of an HFOF bulk load (they have no sequence id)



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment980>

    maybe rename to restoreEdit() or something to be clear that this is during 
restoration only?



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<http://review.hbase.org/r/179/#comment981>

    possibly more efficient:
    Collections.singletonMap(kv.getFamily(), Collections.singletonList(kv))
    
    though there may be a comparator issue?



src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<http://review.hbase.org/r/179/#comment970>

    comment is now out of date, right?



src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
<http://review.hbase.org/r/179/#comment971>

    maybe just Preconditions.checkArgument here so it throws if you try to pass 
a non-DFS?


- Todd


On 2010-06-14 11:43:30, stack wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.hbase.org/r/179/
> -----------------------------------------------------------
> 
> (Updated 2010-06-14 11:43:30)
> 
> 
> Review request for hbase.
> 
> 
> Summary
> -------
> 
> + Moved replay of edits up from Store up into Region. This means we play the 
> edits once only rather than once per Store.
> + Lots of cleanup in the replay of edits code. Uses the new flag introduced 
> by cosmin – hbase.skip.errors. If set, and exception processing edits, we'll 
> fail. If false, we'll move the broke edits file aside and keep going. Renamed 
> the method from doReconstructionLog to replayRecoveredEdits (reconstruction?).
> + The main change in this patch is that recovering edits, we now go in via 
> the HRegion main API doing put and delete so that only one code path, so 
> replayed edits are added to the WAL, and so a flush will be triggered if we 
> fill memory.
> + In HStore, lots of removed code and comments since no longer does log 
> replay. Cleanup of maximum seqid. Calculate it instead rather than save as a 
> data member. Its only used once on HRegion startup.
> + Change the HRegion#initialize. It used to take 'initial files' which is a 
> notion never used (it was a means of putting files in place after a split but 
> split is done internal to HRegion so can do things in HRegions guts w/o need 
> of exposing notion of initial files). I removed it and added overload that 
> takes no args which is the usual way this method is invoked.
> + Rename the product of splits, 'recovered.edits' instead of 'oldlogfile.log'
> + Added small facility to HBaseTestingUtility for creating different user in 
> a Configuration so can have more than one Filesystem instance the easier.
> + Redid the test TestStoreReconstruction as TestWALReplay.
> 
> 
> This addresses bug hbase-1025.
>     http://issues.apache.org/jira/browse/hbase-1025
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/hadoop/hbase/HConstants.java f5d3e94 
>   src/main/java/org/apache/hadoop/hbase/HMerge.java 62f3561 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 06e022c 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
> bca819e 
>   src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 2a0dcee 
>   src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 05cf17f 
>   src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java dbb21d4 
>   src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 479c661 
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java 43a8a28 
>   
> src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreReconstruction.java
>  4f0417d 
>   src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 
> 86cf4ea 
>   src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 
> PRE-CREATION 
> 
> Diff: http://review.hbase.org/r/179/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> stack
> 
>

Reply via email to