Nice writeup Matteo. I'm trying to get to where you are at but am not there yet. Almost. Agree that it should be a blocker that all gets hoisted out of core into a backup module and that the limitations are spelled out clearly in doc.
I have spent a good bit of time reviewing and testing this feature. I would like my review and concerns addressed and I'd like it to be clear how; either explicit follow-on issues, pointers to where in the patch or doc my remarks have been catered to, etc. Until then, I am against commit. St.Ack On Fri, Nov 18, 2016 at 3:48 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote: > I did one last pass to the mega patch. I don't see anything major that > should block the merge. > > - most of the code is isolated in the backup package > - all the backup code is client side > - there are few changes to the server side, mainly for cleaners, wal > rolling and similar (which is ok) > - there is a good number of tests, and an integration test > > the code seems to have still some left overs from the old implementation, > and some stuff needs a cleanup. but I don't think this should be used as an > argument to block the merge. I think the guys will keep working on this and > they may also get help of others once the patch is in master. > > I still have my concerns about the current limitations, but these are > things already planned for phase 3, so some of this stuff may even be in > the final 2.0. > but as long as we have a "current limitations" section in the user guide > mentioning important stuff like the ones below, I'm ok with it. > - if you write to the table with Durability.SKIP_WALS your data will not > be in the incremental-backup > - if you bulkload files that data will not be in the incremental backup > (HBASE-14417) > - the incremental backup will not only contains the data of the table you > specified but also the regions from other tables that are on the same set > of RSs (HBASE-14141) ...maybe a note about security around this topic > - the incremental backup will not contains just the "latest row" between > backup A and B, but it will also contains all the updates occurred in > between. but the restore does not allow you to restore up to a certain > point in time, the restore will always be up to the "latest backup point". > - you should limit the number of "incremental" up to N (or maybe SIZE), to > avoid replay time becoming the bottleneck. (HBASE-14135) > > I'll be ok even with the above not being in the final 2.0, > but i'd like to see as blocker for the final 2.0 (not the merge) > - the backup code moved in an hbase-backup module > - and some more work around tools, especially to try to unify and make > simple the backup experience (simple example: in some case there is a > backup_id argument in others a backupId argument. or things like.. restore > is not clear if given an incremental id it will do the full restore from > full up to that point or if i need to apply manually everything). > > in conclusion, I think we can open a merge vote. I'll be +1 on it, and I > think we should try to reject -1 with just a "code cleanup" motivation, > since there will still be work going on on the code after the merge. > > Matteo > > > On Sun, Nov 6, 2016 at 10:54 PM, Devaraj Das <d...@hortonworks.com> wrote: > > > Stack and others, anything else on the patch? Merge to master now? > > >