[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266307#comment-13266307
 ] 

jirapos...@reviews.apache.org commented on HBASE-5547:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4633/#review7458
-----------------------------------------------------------


Thanks for the review. Going to do a bit of investigation as to more 
commonality/interface hiding, but unfortunately we handle hfiles in both the RS 
and Master (and need to actuate from the client), so we need to touch a lot. 


src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveMonitor.java
<https://reviews.apache.org/r/4633/#comment16469>

    hfiles, tables are another patch :)



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveMonitor.java
<https://reviews.apache.org/r/4633/#comment16468>

    top-level as in o.a.h.hbase? Seems like a bit of clutter and this is just a 
specialization on that packaging and IMO doesn't preclude either master or rs 
packages from usage.
    
    But I can move it if there are strong feelings.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTableTracker.java
<https://reviews.apache.org/r/4633/#comment16470>

    your right, fubbing my commenting a bit.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTableTracker.java
<https://reviews.apache.org/r/4633/#comment16471>

    But its does tracking! And it doesn't actually actuate any changes on the 
rest of the cluster so Manager seems bad too. How about Monitor?



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTableTracker.java
<https://reviews.apache.org/r/4633/#comment16472>

    Started thinking about your later comment about a factory in HRS.java and 
think I'm going with the static factory method there. This still needs to be 
public for testing, but in general we will use the the 
ServerArchiveTableTracker (and I'll add docs to that effect as well).



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTableTracker.java
<https://reviews.apache.org/r/4633/#comment16473>

    It won't be used except for at startup, but seemed bad practice to 
prematurely read ZK before we get the start call on the parent. At the very 
least we can say that we are removing the previous tables though.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16474>

    Should be fixed with the naming change above.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16475>

    Yeah, its definitely a bit odd. Going with a factory here and only having 
the pass in option for testing exposure. See above (tabletracker) and below 
(hrs) for more info on the factory stuff.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16476>

    done.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16477>

    Yeah, its pretty gross. I've been thinking about the ways we can fix 
this/overhaul the zookeeper infrastructure.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16478>

    This is actually a little nicer in that it doesn't preclude you from having 
a table named the same thing as the archive directory.
    
    Otherwise, you are checking the node vs. the node name of 
watcher.archiveHFileZNode, which could be the same. 



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16479>

    Done. Seemed to be the prevailing style, but this goes back to the fact 
that the watcher stuff if crufty.



src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16480>

    correct! Though anything in them would naturally be pb-ed :)



src/main/java/org/apache/hadoop/hbase/backup/ServerHFileTableArchiveTracker.java
<https://reviews.apache.org/r/4633/#comment16481>

    Inherit from tabletracker so we know when we should archive. However, seems 
like a bit cleaner to just let the subclasses just override registerTable() for 
testing.



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/4633/#comment16482>

    what do you mean by backup? Its moving them to another directory, rather 
than being deleted. Sounds like backup to me :)



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/4633/#comment16483>

    just throws a switch. should note its async, so you could have some extra 
backup files, but ZK is fast, so it shouldn' be that many.



src/main/java/org/apache/hadoop/hbase/client/HFileArchiveManager.java
<https://reviews.apache.org/r/4633/#comment16484>

    yup.



src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
<https://reviews.apache.org/r/4633/#comment16485>

    Only need one - the other is going to be unused in revised version.



src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
<https://reviews.apache.org/r/4633/#comment16486>

    Cruft naming - fixed.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/4633/#comment16487>

    cruft - just needs the tracker now. Paradigm seems to be to _not_ have that 
be an interface.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/4633/#comment16488>

    per-table hfile archiving tracker. So... both and neither? Name seems to be 
ok...maybe something like TableHFileArchiveTracker?



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/4633/#comment16490>

    master didn't, but didn't want to add all the zk dependencies into the CJ. 
And if the archive stuff gets more complex, it might need to be in the master 
(weak excuse, but still...)



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/4633/#comment16489>

    truth.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/4633/#comment16491>

    nope, and its a private field.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<https://reviews.apache.org/r/4633/#comment16492>

    Not really - it only cares about if the HFiles should be archived or not, 
which is what the monitor is doing.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<https://reviews.apache.org/r/4633/#comment16493>

    The monitor is what is keeping track of the archive/not logic, but yeah, 
this could be encapsulated in another class. Wanted to avoid all the 
redirection, since the code doesn't seem all that complex but moving it out 
could be reasonable.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
<https://reviews.apache.org/r/4633/#comment16494>

    see above.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/4633/#comment16495>

    much cleaned up.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/4633/#comment16466>

    I was actually initially considering using a factory here, but its 
definitely not. Rather, just creating the ServerHFileArchiveTracker with a 
"Server" instance that will give it the correct server name to register, so its 
kinda factory-ish, but there is only ever one instance (rather than a 
tabletracker creating class) and is super light weight, as opposed to all the 
double-checked locking, special creating etc that goes around using a factory 
in this situation. 
    
    The current impl seems to be the simplest, though I guess we could use an 
ArchiveTracker factory that takes a zk and Server; seems a bit overkill, though 
it does hide some of the impl details...



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/4633/#comment16467>

    fat-fingered the save on HRS.java. Sorry for the cruft. I'll fix it in the 
next review.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/4633/#comment16496>

    No, this is for how the delete method is currently setup - takes a 
regionServerServices, which hides impl details, meaning we need to have this 
method. But its nice for testing too.



src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
<https://reviews.apache.org/r/4633/#comment16497>

    This ends up being cleaner - otherwise each region actually needs to keep 
around a reference to the ArchiveMonitor and then pass it into deleteRegion (in 
HRegion.merge). But if the predominant opinion is to keep that around instead, 
I can make the change. 



src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<https://reviews.apache.org/r/4633/#comment16498>

    This is situation 2 where hfiles need to be archived. Tried to keep as much 
in common between the two situations, but there is limited commonality between 
the "Archive everything here" and "just archive these couple files" 
functionality.
    
    And either case, you need to know if you keep around the hfiles.
    
    I'll try to do some more combining, but we'll see how much more I can pull 
out.



src/main/java/org/apache/hadoop/hbase/util/HFileArchiveCleanup.java
<https://reviews.apache.org/r/4633/#comment16499>

    Nope, eclipse just didn't want to format this to wrap. Fixed. And moved 
into backup package.



src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java
<https://reviews.apache.org/r/4633/#comment16500>

    Just for static methods. Makes it easier for the method interfaces, but 
I'll look into if simplifying makes more sense.



src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
<https://reviews.apache.org/r/4633/#comment16501>

    regionserverservices - yup. see above comment and see if you think 
differently.


- Jesse


On 2012-05-01 02:42:34, Jesse Yates wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4633/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-05-01 02:42:34)
bq.  
bq.  
bq.  Review request for hbase, Michael Stack and Lars Hofhansl.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Essentially, whenever an hfile would be deleted, it is instead moved to 
the archive directory. In this impl, the archive directory is on a per table 
basis, but defaults to '.archive'. Removing hfiles occurs in three places - 
compaction, merge and catalog janitor. The former and two latter are distinctly 
different code paths, but but did pull out some similarities. The latter two 
end up calling the same method, so there should be a reasonable amount of 
overlap.
bq.  
bq.  Implementation wise: 
bq.      Updated the HMasterInterface to pass the calls onto the zookeeper.
bq.      Added a zk listener to handle updates from the master to the RS to 
backup.
bq.      Added a utility for removing files and finding archive directories
bq.      Added tests for the regionserver and catalogjanitor approaches.
bq.      Added creation of manager in regionserver.
bq.  
bq.  
bq.  This addresses bug HBASE-5547.
bq.      https://issues.apache.org/jira/browse/HBASE-5547
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java a9d80a0 
bq.    src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveMonitor.java 
PRE-CREATION 
bq.    
src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTableTracker.java 
PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/backup/HFileArchiveTracker.java 
PRE-CREATION 
bq.    
src/main/java/org/apache/hadoop/hbase/backup/ServerHFileTableArchiveTracker.java
 PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java ee16e72 
bq.    src/main/java/org/apache/hadoop/hbase/client/HFileArchiveManager.java 
PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java 79d5fdd 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java d47b83a 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 7858846 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
61a5988 
bq.    
src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 
6884d53 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 
ea12da4 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/Store.java bf1618e 
bq.    src/main/java/org/apache/hadoop/hbase/util/HFileArchiveCleanup.java 
PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java 
PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 
4fc105f 
bq.    src/main/resources/hbase-default.xml f54b345 
bq.    src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java 
a59e152 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 
cedf31e 
bq.    
src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionHFileArchiving.java
 PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/util/HFileArchiveTestingUtil.java 
PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java 
7d02759 
bq.    
src/test/java/org/apache/hadoop/hbase/util/TestHFileArchivingCleanup.java 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/4633/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Added two tests for the separate cases - archiving via the regionserver 
and for the catalog tracker. Former runs in a mini cluster and also touches the 
changes to HMasterInterface and zookeeper.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jesse
bq.  
bq.


                
> Don't delete HFiles when in "backup mode"
> -----------------------------------------
>
>                 Key: HBASE-5547
>                 URL: https://issues.apache.org/jira/browse/HBASE-5547
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Jesse Yates
>         Attachments: java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, 
> java_HBASE-5547_v6.patch
>
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be delete to <file>.bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be able to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to