Have we considered that, if we wanted to do incremental backups for particular table(s), we could just keep track of all the memstore flushes for those table(s) [and add some logic for bulk loads as well] and have a cleaner that won't delete those storefiles from the archive? Of course, we would have to flush the memstores at the time boundaries (or do a WAL roll), but only at the incremental boundaries.
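To make the idea a bit more concrete, here is a rough Python sketch. It is not HBase code: FlushTracker, close_window and is_deletable are names I made up for illustration, and a real version would have to hook into the flush and bulk load paths and into the archive cleaner.

```python
from collections import defaultdict


class FlushTracker:
    """Hypothetical tracker: remembers which store files were produced by
    flushes or bulk loads for the backed-up tables, per incremental window."""

    def __init__(self, tracked_tables):
        self.tracked_tables = set(tracked_tables)
        # table -> store files written since the last incremental boundary
        self.current = defaultdict(set)
        # list of (backup_id, {table: frozenset(store files)})
        self.backups = []

    def on_flush(self, table, store_file):
        # Only tables that are part of the backup set are tracked.
        if table in self.tracked_tables:
            self.current[table].add(store_file)

    def on_bulk_load(self, table, store_file):
        # Bulk-loaded files are treated like flushed files in this sketch.
        self.on_flush(table, store_file)

    def close_window(self, backup_id):
        # Assumption: the caller has already flushed the memstores (or rolled
        # the WAL) so that everything up to the boundary is in store files.
        manifest = {t: frozenset(f) for t, f in self.current.items()}
        self.backups.append((backup_id, manifest))
        self.current = defaultdict(set)
        return manifest

    def referenced_files(self):
        # Every file named by a retained backup or by the open window.
        refs = set()
        for _, manifest in self.backups:
            for files in manifest.values():
                refs |= files
        for files in self.current.values():
            refs |= files
        return refs


def is_deletable(archived_file, tracker):
    """Cleaner check: an archived store file may go only if no backup needs it."""
    return archived_file not in tracker.referenced_files()
```

The incremental backup for a window is then just the manifest returned by close_window, and the cleaner keeps exactly the files those manifests still reference.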
That way the recovery process would be much faster, and your incremental backup would truly be only what you wrote for that day. And if you wanted to delete an incremental backup (say, to keep only the last n backups around), we would just compact those together.

Maybe this has already been discussed; if it has, I'm sorry for bringing it up again.

On Tue, Jul 19, 2016 at 6:35 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> I have attached rebased mega patch to HBASE-14123 which is going through QA
> run.
>
> I expect some findbugs / javadoc warnings which need to be addressed
> before the merge.
>
> On Tue, Jul 19, 2016 at 6:21 PM, Enis Söztutar <e...@apache.org> wrote:
>
> > Thanks Matteo for chiming in.
> >
> > On Tue, Jul 19, 2016 at 5:02 PM, Matteo Bertozzi <theo.berto...@gmail.com>
> > wrote:
> >
> > > I did some review in the early beginning, but then lost track of the
> > > changes.
> > > but I'd like to give a quick review to the full code once people here are
> > > ok with getting this feature in master (2.0).
> > > (let's say we put a deadline for reviews, like 1 week for reviewing the full
> > > stuff after everyone agrees to get this in. just to avoid holding this for
> > > too long, but still enough time to have people that are interested to look
> > > at it. we did the same thing for MOB with a mega patch
> > > https://reviews.apache.org/r/36391/)
> > >
> >
> > This sounds good. Vladimir / Ted how do you guys want to handle the merge?
> > As a giant patch or a rebase of code in the branch and through git merge.
> >
> > We need to run a vote when the to-be-merged branch is ready. We can set a
> > vote timeout for at least 1 week.
> >
> > >
> > > most of the code seemed isolated from the beginning, few changes here and
> > > there in the core.
> > > so, this side of things seems ok to me.
> > >
> > > maybe some work to add IT tests as mentioned above, but that should not
> > > take long.
> > >
> > > I don't know if there are already docs, but that is another thing we may
> > > want to get in with the merge.
> > > a minimal coverage at least on how to use the feature, and maybe calling it
> > > out as experimental?
> > >
> > > my main concerns were around incremental backups.
> > > I'm still not convinced around the fact that because the WALs contain
> > > regions of multiple tables
> > > the incremental backup will keep around WALs with some data that we don't
> > > really want in the backup (for space or maybe security reasons).
> > >
> > > then there was the question about for how long should I take incrementals,
> > > before deciding that a fresh full backup is less costly in terms of space.
> > > but I think this incremental merge/compaction was a feature on the roadmap
> > > as Phase3.
> > > which I think is ok to get later on,
> > > maybe just call out a lifecycle example on the docs under "best practices".
> > >
> >
> > I think this will depend on the use case, and other factors like bandwidth
> > available, how much data
> > the user is willing to lose in case of catastrophic failure and how
> > "expensive" is full backup versus
> > incremental one.
> >
> > The full backup should also be useable by default, so maybe we can make an
> > option to not even keep WAL files, and completely disable incremental
> > backups?
> >
> > Enis
> >
> > >
> > > has anyone interested in using backups looked at the doc in HBASE-7912?
> > > is the current design of incremental backup acceptable for everyone wanting
> > > to use this feature?
> > > (maybe this should be a question for the @user list and not dev)
> > >
> > > is there anyone already using this feature or is it just devs testing it?
> > > to me it will be interesting having a use-case/workflow example,
> > > to see if in the real world my concerns about incremental are not showing
> > > up.
> > >
> > > On Tue, Jul 19, 2016 at 1:35 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > Gentle ping on this subject.
> > > >
> > > > The changes are mostly non-intrusive.
> > > >
> > > > More comments are welcome.
> > > >
> > > > On Mon, Jul 11, 2016 at 9:29 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> > > > wrote:
> > > >
> > > > > Not that hard, Andrew. I will open JIRA.
> > > > >
> > > > > -Vlad
> > > > >
> > > > > On Mon, Jul 11, 2016 at 8:46 PM, Andrew Purtell <andrew.purt...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > How hard would it be to convert what you've been using to test end to end
> > > > > > during dev into an IT?
> > > > > >
> > > > > >
> > > > > > > On Jul 11, 2016, at 5:31 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >>> Is there an integration test in hbase-it yet? If not, any tips on a
> > > > > > >>> semi-automateable way to take backups and restore them?
> > > > > > >
> > > > > > > We do not have one yet, but we have a lot of unit tests. We provide 2 APIs for
> > > > > > > backup:
> > > > > > >
> > > > > > > 1. Admin.getBackupAdmin
> > > > > > >
> > > > > > > 2. Command line via the hbase command.
> > > > > > >
> > > > > > > Everything is straightforward.
> > > > > > >
> > > > > > > -Vlad
> > > > > > >
> > > > > > >
> > > > > > >> On Mon, Jul 11, 2016 at 5:23 PM, Dima Spivak <dspi...@cloudera.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> Is there an integration test in hbase-it yet? If not, any tips on a
> > > > > > >> semi-automateable way to take backups and restore them?
> > > > > > >>
> > > > > > >> -Dima
> > > > > > >>
> > > > > > >> On Mon, Jul 11, 2016 at 6:42 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Sorry, wrong links:
> > > > > > >>> These are the phases:
> > > > > > >>>
> > > > > > >>> Phase 1:
> > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14030
> > > > > > >>> Phase 2:
> > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14123
> > > > > > >>> Phase 3:
> > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14414
> > > > > > >>>
> > > > > > >>> -Vlad
> > > > > > >>>
> > > > > > >>> On Mon, Jul 11, 2016 at 4:41 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>>> These are the phases:
> > > > > > >>>>
> > > > > > >>>> Phase 1:
> > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14030 <https://issues.apache.org/jira/browse/HBASE-7912>
> > > > > > >>>> Phase 2:
> > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14123 <https://issues.apache.org/jira/browse/HBASE-7912>
> > > > > > >>>> Phase 3:
> > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14414 <https://issues.apache.org/jira/browse/HBASE-7912>
> > > > > > >>>>
> > > > > > >>>> -Vlad
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> On Mon, Jul 11, 2016 at 12:21 PM, Enis Söztutar <e...@apache.org>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>> As you guys may already be familiar, Vladimir, Ted, Jerry and others have
> > > > > > >>>>> been developing the backup / restore functionality in a series of issues
> > > > > > >>>>> committed in the separate branch HBASE-7912[1].
> > > > > > >>>>>
> > > > > > >>>>> Backup / Restore functionality is tracked as a 4-phase project, and the
> > > > > > >>>>> first two phases are complete and useable. We are now working on Phase 3
> > > > > > >>>>> items, which are mostly improvements. We think that the current code in
> > > > > > >>>>> the branch containing all Phase 1 and Phase 2 items, and some Phase 3 items
> > > > > > >>>>> is useable on its own, and we do not have to wait for all the subtickets to
> > > > > > >>>>> be finished to make it completely useable (as follow up tickets are mostly
> > > > > > >>>>> improvements or optimizations). The improvements in the works are all
> > > > > > >>>>> backwards compatible with the existing stuff. Thus, we would like to
> > > > > > >>>>> propose that the branch HBASE-7912 be merged into master. The parent jira
> > > > > > >>>>> has a design doc that goes into details about the implementation and
> > > > > > >>>>> design choices in case you are interested[2].
> > > > > > >>>>>
> > > > > > >>>>> Most of the changes are largely non-intrusive and confined to the
> > > > > > >>>>> backup subsystem.
> > > > > > >>>>> The unit tests have been passing on manual runs and we (hortonworks) have
> > > > > > >>>>> been running the integration tests as well as some other shell-based
> > > > > > >>>>> system tests on a forked version of the code. Most of the work has been
> > > > > > >>>>> reviewed by 1, 2 or 3 committers already (mostly Ted, myself and Jerry).
> > > > > > >>>>>
> > > > > > >>>>> What do you guys think? Is it time to call a vote? Any concerns or
> > > > > > >>>>> feedback appreciated.
> > > > > > >>>>>
> > > > > > >>>>> [1] https://issues.apache.org/jira/browse/HBASE-7912
> > > > > > >>>>> [2] https://issues.apache.org/jira/secure/attachment/12816339/HBaseBackupAndRestore%20-0.91.pdf
> > > > > > >>>>>
> > > > > > >>>>> Enis