Any more comments from the community on whether the merge can proceed?

Thanks

On Mon, Aug 1, 2016 at 12:03 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> Carter Shanklin posted a blog article about the feature:
> Some use cases and examples of a command line interface usage.
>
> https://hortonworks.com/blog/coming-hdp-2-5-incremental-backup-restore-apache-hbase-apache-phoenix/
>
> -Vlad
>
> On Wed, Jul 20, 2016 at 1:25 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
>
> > Ok, got it.
> >
> > -Vlad
> >
> > On Wed, Jul 20, 2016 at 12:15 PM, Enis Söztutar <e...@apache.org> wrote:
> >
> >> We keep the WALs which can accumulate a lot if the use case is to only do
> >> backups infrequently. This will definitely cause issues since HDFS space
> >> will get filled up. That is why we may need an option for having
> >> incremental backups not used, and WAL references being deleted.
> >>
> >> Enis
> >>
> >> On Tue, Jul 19, 2016 at 6:33 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >>
> >> > Why would anyone ever need to disable incremental backups? If you do not need
> >> > it - just run only full backups.
> >> >
> >> > -Vlad
> >> >
> >> > On Tue, Jul 19, 2016 at 6:21 PM, Enis Söztutar <e...@apache.org> wrote:
> >> >
> >> > > Thanks Matteo for chiming in.
> >> > >
> >> > > On Tue, Jul 19, 2016 at 5:02 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
> >> > >
> >> > > > I did some review in the early beginning, but then lost track of the
> >> > > > changes.
> >> > > > but I'd like to give a quick review to the full code once people here are
> >> > > > ok with getting this feature in master (2.0).
> >> > > > (let's say we put a deadline for reviews, like 1 week for reviewing the full
> >> > > > stuff after everyone agrees to get this in. just to avoid holding this for
> >> > > > too long, but still enough time to have people that are interested to look
> >> > > > at it.
> >> > > > we did the same thing for MOB with a mega patch
> >> > > > https://reviews.apache.org/r/36391/)
> >> > > >
> >> > >
> >> > > This sounds good. Vladimir / Ted how do you guys want to handle the merge?
> >> > > As a giant patch or a rebase of code in the branch and through git merge.
> >> > >
> >> > > We need to run a vote when the to-be-merged branch is ready. We can set a
> >> > > vote timeout for at least 1 week.
> >> > >
> >> > > >
> >> > > > most of the code seemed isolated from the beginning, few changes here and
> >> > > > there in the core.
> >> > > > so, this side of things seems ok to me.
> >> > > >
> >> > > > maybe some work to add IT tests as mentioned above, but that should not
> >> > > > take long.
> >> > > >
> >> > > > I don't know if there are already docs, but that is another thing we may
> >> > > > want to get in with the merge.
> >> > > > a minimal coverage at least on how to use the feature, and maybe calling it
> >> > > > out as experimental?
> >> > > >
> >> > > > my main concerns were around incremental backups.
> >> > > > I'm still not convinced around the fact that because the WALs contain
> >> > > > regions of multiple tables
> >> > > > the incremental backup will keep around WALs with some data that we don't
> >> > > > really want in the backup (for space or maybe security reason).
> >> > > >
> >> > > > then there was the question about for how long should I take incrementals,
> >> > > > before deciding that a fresh full backup is less costly in terms of space.
> >> > > > but I think this incremental merge/compaction was a feature on the roadmap
> >> > > > as Phase3.
> >> > > > which I think is ok to get later on,
> >> > > > maybe just call out a lifecycle example on the docs under "best practices".
> >> > > >
> >> > >
> >> > > I think this will depend on the use case, and other factors like bandwidth
> >> > > available, how much data the user is willing to lose in case of
> >> > > catastrophic failure and how "expensive" is full backup versus
> >> > > incremental one.
> >> > >
> >> > > The full backup should also be useable by default, so maybe we can make an
> >> > > option to not even keep WAL files, and completely disable incremental
> >> > > backups?
> >> > >
> >> > > Enis
> >> > >
> >> > > >
> >> > > > has anyone interested in using backups looked at the doc in HBASE-7912?
> >> > > > is the current design of incremental backup acceptable for everyone wanting
> >> > > > to use this feature?
> >> > > > (maybe this should be a question for the @user list and not dev)
> >> > > >
> >> > > > is there anyone already using this feature or it is just dev testing it?
> >> > > > to me will be interesting having a use-case/workflow example,
> >> > > > to see if in the real world my concerns about incremental are not showing
> >> > > > up.
> >> > > >
> >> > > > On Tue, Jul 19, 2016 at 1:35 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >> > > >
> >> > > > > Gentle ping on this subject.
> >> > > > >
> >> > > > > The changes are mostly non-intrusive.
> >> > > > >
> >> > > > > More comments are welcome.
> >> > > > >
> >> > > > > On Mon, Jul 11, 2016 at 9:29 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >> > > > >
> >> > > > > > Not that hard, Andrew. I will open JIRA.
> >> > > > > >
> >> > > > > > -Vlad
> >> > > > > >
> >> > > > > > On Mon, Jul 11, 2016 at 8:46 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > How hard would it be to convert what you've been using to test end to end
> >> > > > > > > during dev into an IT?
> >> > > > > > >
> >> > > > > > > On Jul 11, 2016, at 5:31 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > >>> Is there an integration test in hbase-it yet? If not, any tips on a
> >> > > > > > > >>> semi-automateable way to take backups and restore them?
> >> > > > > > > >
> >> > > > > > > > We do not have one yet, but we have a lot of unit tests. We provide 2 APIs for
> >> > > > > > > > backup:
> >> > > > > > > >
> >> > > > > > > > 1. Admin.getBackupAdmin
> >> > > > > > > >
> >> > > > > > > > 2. Command line via the hbase command.
> >> > > > > > > >
> >> > > > > > > > Everything is straightforward.
> >> > > > > > > >
> >> > > > > > > > -Vlad
> >> > > > > > > >
> >> > > > > > > >> On Mon, Jul 11, 2016 at 5:23 PM, Dima Spivak <dspi...@cloudera.com> wrote:
> >> > > > > > > >>
> >> > > > > > > >> Is there an integration test in hbase-it yet? If not, any tips on a
> >> > > > > > > >> semi-automateable way to take backups and restore them?
> >> > > > > > > >>
> >> > > > > > > >> -Dima
> >> > > > > > > >>
> >> > > > > > > >> On Mon, Jul 11, 2016 at 6:42 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >> > > > > > > >>
> >> > > > > > > >>> Sorry, wrong links:
> >> > > > > > > >>> These are the phases:
> >> > > > > > > >>>
> >> > > > > > > >>> Phase 1:
> >> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14030
> >> > > > > > > >>> Phase 2:
> >> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14123
> >> > > > > > > >>> Phase 3:
> >> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14414
> >> > > > > > > >>>
> >> > > > > > > >>> -Vlad
> >> > > > > > > >>>
> >> > > > > > > >>> On Mon, Jul 11, 2016 at 4:41 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >> > > > > > > >>>
> >> > > > > > > >>>> These are the phases:
> >> > > > > > > >>>>
> >> > > > > > > >>>> Phase 1:
> >> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14030 <https://issues.apache.org/jira/browse/HBASE-7912>
> >> > > > > > > >>>> Phase 2:
> >> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14123 <https://issues.apache.org/jira/browse/HBASE-7912>
> >> > > > > > > >>>> Phase 3:
> >> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14414 <https://issues.apache.org/jira/browse/HBASE-7912>
> >> > > > > > > >>>>
> >> > > > > > > >>>> -Vlad
> >> > > > > > > >>>>
> >> > > > > > > >>>> On Mon, Jul 11, 2016 at 12:21 PM, Enis Söztutar <e...@apache.org> wrote:
> >> > > > > > > >>>>
> >> > > > > > > >>>>> As you guys may already be
> >> > > > > > > >>>>> familiar, Vladimir, Ted, Jerry and others have
> >> > > > > > > >>>>> been developing the backup / restore functionality in a series of issues
> >> > > > > > > >>>>> committed in the separate branch HBASE-7912[1].
> >> > > > > > > >>>>>
> >> > > > > > > >>>>> Backup / Restore functionality is tracked as a 4-phase project, and the
> >> > > > > > > >>>>> first two phases are complete and useable. We are now working on Phase 3
> >> > > > > > > >>>>> items, which are mostly improvements. We think that the current code in
> >> > > > > > > >>>>> the branch containing all Phase 1 and Phase 2 items, and some Phase 3 items is
> >> > > > > > > >>>>> useable on its own, and we do not have to wait for all the subtickets to
> >> > > > > > > >>>>> be finished to make it completely useable (as follow up tickets are mostly
> >> > > > > > > >>>>> improvements or optimizations). The improvements in the works are all
> >> > > > > > > >>>>> backwards compatible with the existing stuff. Thus, we would like to
> >> > > > > > > >>>>> propose that the branch HBASE-7912 be merged into master. The parent jira
> >> > > > > > > >>>>> has a design doc that goes into details about the implementation and
> >> > > > > > > >>>>> design choices in case you are interested[2].
> >> > > > > > > >>>>>
> >> > > > > > > >>>>> Most of the changes are largely non-intrusive and confined to the
> >> > > > > > > >>>>> backup subsystem.
> >> > > > > > > >>>>> The unit tests have been passing on manual runs and we (hortonworks) have
> >> > > > > > > >>>>> been running the integration tests as well as some other shell-based
> >> > > > > > > >>>>> system tests on a forked version of the code. Most of the work has been reviewed
> >> > > > > > > >>>>> by 1, 2 or 3 committers already (mostly Ted, myself and Jerry).
> >> > > > > > > >>>>>
> >> > > > > > > >>>>> What do you guys think? Is it time to call a vote? Any concerns or
> >> > > > > > > >>>>> feedback appreciated.
> >> > > > > > > >>>>>
> >> > > > > > > >>>>> [1] https://issues.apache.org/jira/browse/HBASE-7912
> >> > > > > > > >>>>> [2] https://issues.apache.org/jira/secure/attachment/12816339/HBaseBackupAndRestore%20-0.91.pdf
> >> > > > > > > >>>>>
> >> > > > > > > >>>>> Enis