[jira] [Resolved] (HBASE-22749) Distributed MOB compactions

2020-04-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-22749.
---
Resolution: Fixed

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in the Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server.
> # The YARN/MapReduce framework is required to run MOB compactions in a scalable 
> way, but this won't work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files, and their 
> interactions, can result in data loss (see HBASE-22075).
> The design goal for MOB 2.0 was to provide a 100% MOB 1.0-compatible 
> implementation that is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. The design goals 
> of MOB 2.0 are:
> # Make MOB compactions scalable without relying on the YARN/MapReduce framework.
> # Provide a unified compactor for both MOB and regular store files.
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide a 100% compatible implementation with MOB 1.0.
> # Require no data migration between MOB 1.0 and MOB 2.0 - just a 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-24101) Correct snapshot handling

2020-04-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-24101.
---
Resolution: Not A Problem

> Correct snapshot handling
> -
>
> Key: HBASE-24101
> URL: https://issues.apache.org/jira/browse/HBASE-24101
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob, snapshots
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Critical
>
> Reopening this umbrella to address correct snapshot handling. In particular, 
> the following scenario must be verified (sketched below):
> # load data to a table
> # take snapshot
> # major compact table
> # run mob file cleaner chore
> # load data to table
> # restore table from snapshot into another table
> # verify data integrity
> # restore table from snapshot into original table
> # verify data integrity
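
A minimal sketch of this scenario against the public Admin API (the table, clone, 
and snapshot names are made up for illustration; data loading, the MOB file 
cleaner chore trigger, and the integrity checks are intentionally left out):

{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MobSnapshotScenarioSketch {
  public static void main(String[] args) throws Exception {
    TableName table = TableName.valueOf("mob_test");        // hypothetical table
    TableName clone = TableName.valueOf("mob_test_clone");  // hypothetical clone target
    String snapshot = "mob_test_snap";                       // hypothetical snapshot name
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      // steps 1-2: load data (not shown), then take a snapshot
      admin.snapshot(snapshot, table);
      // step 3: major compact the table
      admin.majorCompact(table);
      // step 4: the MOB file cleaner chore runs inside the master (not triggered here)
      // step 5: load more data (not shown)
      // steps 6-7: restore the snapshot into another table, then verify integrity
      admin.cloneSnapshot(snapshot, clone);
      // steps 8-9: restore the snapshot into the original table, then verify integrity
      admin.disableTable(table);
      admin.restoreSnapshot(snapshot);
      admin.enableTable(table);
    }
  }
}
{code}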



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] New User Experience and Data Durability Guarantees on LocalFileSystem (HBASE-24086)

2020-04-15 Thread Vladimir Rodionov
This should work for locally attached storage for sure.

On Wed, Apr 15, 2020 at 3:52 PM Vladimir Rodionov 
wrote:

> FileOutputStream.getFileChannel().force(true) will get all durability we
> need. Just a simple code change?
>
>
> On Wed, Apr 15, 2020 at 12:32 PM Andrew Purtell 
> wrote:
>
>> This thread talks of “durability” via filesystem characteristics but also
>> for single system quick Start type deployments. For durability we need
>> multi server deployments. No amount of hacking a single system deployment
>> is going to give us durability as users will expect (“don’t lose my data”).
>> I believe my comments are on topic.
>>
>>
>> > On Apr 15, 2020, at 11:03 AM, Nick Dimiduk  wrote:
>> >
>> > On Wed, Apr 15, 2020 at 10:28 AM Andrew Purtell 
>> wrote:
>> >
>> >> Nick's mail doesn't make a distinction between avoiding data loss via
>> >> typical tmp cleaner configurations, unfortunately adjacent to mention
>> of
>> >> "durability", and real data durability, which implies more than what a
>> >> single system configuration can offer, no matter how many tweaks we
>> make to
>> >> LocalFileSystem. Maybe I'm being pedantic but this is something to be
>> >> really clear about IMHO.
>> >>
>> >
>> > I prefer to focus the attention of this thread to the question of data
>> > durability via `FileSystem` characteristics. I agree that there are
>> > concerns of durability (and others) around the use of the path under
>> /tmp.
>> > Let's keep that discussion in the other thread.
>> >
>> >> On Wed, Apr 15, 2020 at 10:05 AM Sean Busbey 
>> wrote:
>> >>
>> >>> I think the first assumption no longer holds. Especially with the move
>> >>> to flexible compute environments I regularly get asked by folks what
>> >>> the smallest HBase they can start with for production. I can keep
>> >>> saying 3/5/7 nodes or whatever but I guarantee there are folks who
>> >>> want to and will run HBase with a single node. Probably those
>> >>> deployments won't want to have the distributed flag set. None of them
>> >>> really have a good option for where the WALs go, and failing loud when
>> >>> they try to go to LocalFileSystem is the best option I've seen so far
>> >>> to make sure folks realize they are getting into muddy waters.
>> >>>
>> >>> I agree with the second assumption. Our quickstart in general is too
>> >>> complicated. Maybe if we include big warnings in the guide itself, we
>> >>> could make a quickstart specific artifact to download that has the
>> >>> unsafe disabling config in place?
>> >>>
>> >>> Last fall I toyed with the idea of adding an "hbase-local" module to
>> >>> the hbase-filesystem repo that could start us out with some
>> >>> optimizations for single node set ups. We could start with a fork of
>> >>> RawLocalFileSystem (which will call OutputStream flush operations in
>> >>> response to hflush/hsync) that properly advertises its
>> >>> StreamCapabilities to say that it supports the operations we need.
>> >>> Alternatively we could make our own implementation of FileSystem that
>> >>> uses NIO stuff. Either of these approaches would solve both problems.
>> >>>
>> >>> On Wed, Apr 15, 2020 at 11:40 AM Nick Dimiduk 
>> >> wrote:
>> >>>>
>> >>>> Hi folks,
>> >>>>
>> >>>> I'd like to bring up the topic of the experience of new users as it
>> >>>> pertains to use of the `LocalFileSystem` and its associated (lack of)
>> >>> data
>> >>>> durability guarantees. By default, an unconfigured HBase runs with
>> its
>> >>> root
>> >>>> directory on a `file:///` path. This patch is picked up as an
>> instance
>> >> of
>> >>>> `LocalFileSystem`. Hadoop has long offered this class, but it has
>> never
>> >>>> supported `hsync` or `hflush` stream characteristics. Thus, when
>> HBase
>> >>> runs
>> >>>> on this configuration, it is unable to ensure that WAL writes are
>> >>> durable,
>> >>>> and thus will ACK a write without this assurance. This is the case,
>> >> even
>> >>>> when running in a fully durable WAL mode.

Re: [DISCUSS] New User Experience and Data Durability Guarantees on LocalFileSystem (HBASE-24086)

2020-04-15 Thread Vladimir Rodionov
FileOutputStream.getFileChannel().force(true) will get all durability we
need. Just a simple code change?
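
A minimal sketch of that suggestion, assuming a plain local file standing in for
the WAL (note the JDK accessor is FileOutputStream.getChannel(); force(true) asks
the OS to push both data and file metadata to the device, which is roughly the
guarantee hsync expects):

{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public final class LocalForceSketch {
  public static void main(String[] args) throws IOException {
    try (FileOutputStream out = new FileOutputStream("/tmp/wal-demo.log", true)) {
      out.write("edit-1\n".getBytes(StandardCharsets.UTF_8));
      // Sync data and metadata; actual durability is still bounded by the OS and disk cache.
      out.getChannel().force(true);
    }
  }
}
{code}

As the replies quoted below point out, this at best hardens a single-node setup;
it is not a substitute for multi-server durability.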


On Wed, Apr 15, 2020 at 12:32 PM Andrew Purtell 
wrote:

> This thread talks of “durability” via filesystem characteristics but also
> for single system quick Start type deployments. For durability we need
> multi server deployments. No amount of hacking a single system deployment
> is going to give us durability as users will expect (“don’t lose my data”).
> I believe my comments are on topic.
>
>
> > On Apr 15, 2020, at 11:03 AM, Nick Dimiduk  wrote:
> >
> > On Wed, Apr 15, 2020 at 10:28 AM Andrew Purtell 
> wrote:
> >
> >> Nick's mail doesn't make a distinction between avoiding data loss via
> >> typical tmp cleaner configurations, unfortunately adjacent to mention of
> >> "durability", and real data durability, which implies more than what a
> >> single system configuration can offer, no matter how many tweaks we
> make to
> >> LocalFileSystem. Maybe I'm being pedantic but this is something to be
> >> really clear about IMHO.
> >>
> >
> > I prefer to focus the attention of this thread to the question of data
> > durability via `FileSystem` characteristics. I agree that there are
> > concerns of durability (and others) around the use of the path under
> /tmp.
> > Let's keep that discussion in the other thread.
> >
> >> On Wed, Apr 15, 2020 at 10:05 AM Sean Busbey  wrote:
> >>
> >>> I think the first assumption no longer holds. Especially with the move
> >>> to flexible compute environments I regularly get asked by folks what
> >>> the smallest HBase they can start with for production. I can keep
> >>> saying 3/5/7 nodes or whatever but I guarantee there are folks who
> >>> want to and will run HBase with a single node. Probably those
> >>> deployments won't want to have the distributed flag set. None of them
> >>> really have a good option for where the WALs go, and failing loud when
> >>> they try to go to LocalFileSystem is the best option I've seen so far
> >>> to make sure folks realize they are getting into muddy waters.
> >>>
> >>> I agree with the second assumption. Our quickstart in general is too
> >>> complicated. Maybe if we include big warnings in the guide itself, we
> >>> could make a quickstart specific artifact to download that has the
> >>> unsafe disabling config in place?
> >>>
> >>> Last fall I toyed with the idea of adding an "hbase-local" module to
> >>> the hbase-filesystem repo that could start us out with some
> >>> optimizations for single node set ups. We could start with a fork of
> >>> RawLocalFileSystem (which will call OutputStream flush operations in
> >>> response to hflush/hsync) that properly advertises its
> >>> StreamCapabilities to say that it supports the operations we need.
> >>> Alternatively we could make our own implementation of FileSystem that
> >>> uses NIO stuff. Either of these approaches would solve both problems.
> >>>
> >>> On Wed, Apr 15, 2020 at 11:40 AM Nick Dimiduk 
> >> wrote:
> 
>  Hi folks,
> 
>  I'd like to bring up the topic of the experience of new users as it
>  pertains to use of the `LocalFileSystem` and its associated (lack of)
> >>> data
>  durability guarantees. By default, an unconfigured HBase runs with its
> >>> root
>  directory on a `file:///` path. This path is picked up as an instance
> >> of
>  `LocalFileSystem`. Hadoop has long offered this class, but it has
> never
>  supported `hsync` or `hflush` stream characteristics. Thus, when HBase
> >>> runs
>  on this configuration, it is unable to ensure that WAL writes are
> >>> durable,
>  and thus will ACK a write without this assurance. This is the case,
> >> even
>  when running in a fully durable WAL mode.
> 
>  This impacts a new user, someone kicking the tires on HBase following
> >> our
>  Getting Started docs. On Hadoop 2.8 and before, an unconfigured HBase
> >>> will
>  WARN and carry on. On Hadoop 2.10+, HBase will refuse to start. The book
>  describes a process of disabling stream capability
>  enforcement as a first step. This is a mandatory configuration for
> >>> running
>  HBase directly out of our binary distribution.
> 
>  HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running
> on
>  2.8: log a warning and carry on. The critique of this approach is that
> >>> it's
>  far too subtle, too quiet for a system operating in a state known to
> >> not
>  provide data durability.
> 
>  I have two assumptions/concerns around the state of things, which
> >>> prompted
>  my solution on HBASE-24086 and the associated doc update on
> >> HBASE-24106.
> 
>  1. No one should be running a production system on `LocalFileSystem`.
> 
>  The initial implementation checked both for `LocalFileSystem` and
>  `hbase.cluster.distributed`. When running on the former and the latter
> >> is
>  false, we assume the user is running a 

Re: [DISCUSS] Arrange Events for 10-year Anniversary

2020-04-15 Thread Vladimir Rodionov
2020 - 10 = 2010. As far as I remember I joined HBase community in 2009 :)
and I am pretty sure that Mr. Stack did it even earlier.

Best regards,
Vlad

On Wed, Apr 15, 2020 at 5:57 AM Yu Li  wrote:

> Dear all,
>
> Since our project has reached its 10th birthday, and 10 years is definitely
> a great milestone, I propose to arrange some special (virtual) events for
> celebration. What comes into my mind include:
>
> * Open threads to collect voices from our dev/user mailing list, like "what
> do you want to say to HBase for its 10th birthday" (as well as our twitter
> accounts maybe, if any)
>
> * Arrange some online interviews to both PMC members and our customers.
> Some of us have been in this project all the way and there must be some
> good stories to tell, as well as expectations for the future.
>
> * Join the Apache Feathercast as suggested in another thread.
>
> * Form a blogpost to include all above events as an official celebration.
>
> What do you think? Any other good ideas? Looking forward to more voices
> (smile).
>
> Best Regards,
> Yu
>


[jira] [Created] (HBASE-24101) Correct snapshot handling

2020-04-01 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-24101:
-

 Summary: Correct snapshot handling
 Key: HBASE-24101
 URL: https://issues.apache.org/jira/browse/HBASE-24101
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HBASE-22749) Distributed MOB compactions

2020-04-01 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-22749:
---

Reopening this umbrella to address correct snapshot handling 

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in the Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server.
> # The YARN/MapReduce framework is required to run MOB compactions in a scalable 
> way, but this won't work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files, and their 
> interactions, can result in data loss (see HBASE-22075).
> The design goal for MOB 2.0 was to provide a 100% MOB 1.0-compatible 
> implementation that is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. The design goals 
> of MOB 2.0 are:
> # Make MOB compactions scalable without relying on the YARN/MapReduce framework.
> # Provide a unified compactor for both MOB and regular store files.
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide a 100% compatible implementation with MOB 1.0.
> # Require no data migration between MOB 1.0 and MOB 2.0 - just a 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23363) MobCompactionChore takes a long time to complete once job

2020-03-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23363.
---
Resolution: Won't Fix

HBASE-22749 has introduced distributed MOB compaction, which significantly 
improves performance. Distributed MOB compaction will be back-ported to 2.x 
branches soon. 

> MobCompactionChore takes a long time to complete once job
> -
>
> Key: HBASE-23363
> URL: https://issues.apache.org/jira/browse/HBASE-23363
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.1, 2.2.2
>Reporter: Bo Cui
>Priority: Major
> Attachments: image-2019-12-04-11-01-20-352.png
>
>
> MOB table compaction is done in the master.
>  The poolSize of the hbase choreService is 1.
>  If hbase has 1000 MOB tables, MobCompactionChore takes a long time to complete 
> one job, and other chores need to wait.
> !image-2019-12-04-11-01-20-352.png!
> {code:java}
> // Simplified view of MobCompactionChore#chore(): every MOB-enabled column family
> // of every table is compacted sequentially inside the single chore thread.
> MobCompactionChore#chore() {
>   ...
>   for (TableDescriptor htd : map.values()) {
>     ...
>     for (ColumnFamilyDescriptor hcd : htd.getColumnFamilies()) {
>       if (hcd.isMobEnabled()) {
>         MobUtils.doMobCompaction(...);  // blocks until this family's MOB compaction finishes
>       }
>     }
>     ...
>   }
>   ...
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23840) Revert optimized IO back to general compaction during upgrade/migration process

2020-02-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23840.
---
Resolution: Fixed

> Revert optimized IO back to general compaction during upgrade/migration 
> process 
> 
>
> Key: HBASE-23840
> URL: https://issues.apache.org/jira/browse/HBASE-23840
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> Optimized-mode IO compaction may leave an old MOB file whose size is above 
> the threshold as is and not compact it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23840) Revert optimized IO back to general compaction during upgrade/migration process

2020-02-14 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23840:
-

 Summary: Revert optimized IO back to general compaction during 
upgrade/migration process 
 Key: HBASE-23840
 URL: https://issues.apache.org/jira/browse/HBASE-23840
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Optimized-mode IO compaction may leave an old MOB file whose size is above the 
threshold as is and not compact it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23724) Change code in StoreFileInfo to use regex matcher for mob files.

2020-01-22 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23724:
-

 Summary: Change code in StoreFileInfo to use regex matcher for mob 
files.
 Key: HBASE-23724
 URL: https://issues.apache.org/jira/browse/HBASE-23724
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Currently it sits on top of another regex with additional logic added. The code 
should be simplified.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23723) Add tests for MOB compaction on a table created from snapshot

2020-01-22 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23723:
-

 Summary: Add tests for MOB compaction on a table created from 
snapshot
 Key: HBASE-23723
 URL: https://issues.apache.org/jira/browse/HBASE-23723
 Project: HBase
  Issue Type: Sub-task
 Environment: How does the code handle the snapshot naming convention for MOB 
files?
Reporter: Vladimir Rodionov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23571) Handle CompactType.MOB correctly

2019-12-12 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23571:
-

 Summary: Handle CompactType.MOB correctly
 Key: HBASE-23571
 URL: https://issues.apache.org/jira/browse/HBASE-23571
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Client-facing feature; it should be supported or at least properly handled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-22 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23189.
---
Resolution: Fixed

> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> +corresponding test cases
> The current code for I/O optimized compaction has not been tested and 
> verified yet. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23267.
---
Resolution: Fixed

Resolved. Pushed to the parent's PR branch.

> Test case for MOB compaction in a regular mode.
> ---
>
> Key: HBASE-23267
> URL: https://issues.apache.org/jira/browse/HBASE-23267
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> We need this test case too. 
> Test case description (similar to HBASE-23266):
> {code}
> /**
>   * MOB file compaction chore in default (regular) mode test.
>   * 1. Enables non-batch mode (default) for regular MOB compaction,
>   *    sets batch size to 7 regions.
>   * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec.
>   * 3. Creates a MOB table with 20 regions.
>   * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
>   * 5. Repeats step 4 two more times.
>   * 6. Verifies that we have 20 * 3 = 60 MOB files (equal to number of regions x 3).
>   * 7. Runs major MOB compaction.
>   * 8. Verifies that the number of MOB files in the mob directory is 20 x 4 = 80.
>   * 9. Waits for a period of time larger than the minimum age to archive.
>   * 10. Runs the MOB cleaner chore.
>   * 11. Verifies that the number of MOB files in the mob directory is 20.
>   * 12. Runs a scanner and checks all 3 * 1000 rows.
>   */
> {code}
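
For orientation, a minimal sketch of the load/flush/compact portion of the steps
above using the public Admin API (the table name and the data-loading helper are
hypothetical; batch-size and age-to-archive configuration, the cleaner chore run,
and the file-count assertions are omitted):

{code:java}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.CompactType;

public class MobRegularModeSketch {
  // Assumes an existing MOB-enabled table; data loading and assertions are omitted.
  static void loadFlushCompact(Admin admin) throws Exception {
    TableName table = TableName.valueOf("mob_regular_mode"); // hypothetical table name
    for (int i = 0; i < 3; i++) {
      // loadRandomRows(table, 1000);  // hypothetical helper: 1000 rows, randomized keys
      admin.flush(table);              // each flush adds one MOB file per region holding MOB data
    }
    // Step 7: major MOB compaction; CompactType.MOB targets the MOB store files.
    admin.majorCompact(table, CompactType.MOB);
  }
}
{code}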



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23267:
-

 Summary: Test case for MOB compaction in a regular mode.
 Key: HBASE-23267
 URL: https://issues.apache.org/jira/browse/HBASE-23267
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


We need this test case too. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23266.
---
Resolution: Fixed

Resolved. Pushed change to parent's PR branch.

> Test case for MOB compaction in a region's batch mode.
> --
>
> Key: HBASE-23266
> URL: https://issues.apache.org/jira/browse/HBASE-23266
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> Major MOB compaction in the general (non-generational) mode can be run in a 
> batched mode (disabled by default). In this mode, only a subset of regions at a 
> time is compacted, to mitigate possible compaction storms. We need a test case 
> for this mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23266:
-

 Summary: Test case for MOB compaction in a region's batch mode.
 Key: HBASE-23266
 URL: https://issues.apache.org/jira/browse/HBASE-23266
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Major MOB compaction in the general (non-generational) mode can be run in a 
batched mode (disabled by default). In this mode, only a subset of regions at a 
time is compacted, to mitigate possible compaction storms. We need a test case 
for this mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23188) MobFileCleanerChore test case

2019-11-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23188.
---
Resolution: Fixed

Resolved. Pushed to parent PR branch.

> MobFileCleanerChore test case
> -
>
> Key: HBASE-23188
> URL: https://issues.apache.org/jira/browse/HBASE-23188
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>Priority: Major
>
> The test should do the following:
> a) properly remove obsolete files, as expected
> b) not remove mob files from before the reference accounting added in 
> this change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23190) Convert MobCompactionTest into integration test

2019-11-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23190.
---
Resolution: Fixed

Resolved in the last parent PR commit (11/5). 

> Convert MobCompactionTest into integration test
> ---
>
> Key: HBASE-23190
> URL: https://issues.apache.org/jira/browse/HBASE-23190
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-31 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23209.
---
Resolution: Fixed

> Simplify logic in DefaultMobStoreCompactor
> --
>
> Key: HBASE-23209
> URL: https://issues.apache.org/jira/browse/HBASE-23209
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> The major compaction loop is quite large and has many branches, especially in 
> non-MOB mode. Consider moving MOB data only in MOB compaction mode and 
> simplifying the non-MOB case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-23 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23209:
-

 Summary: Simplify logic in DefaultMobStoreCompactor
 Key: HBASE-23209
 URL: https://issues.apache.org/jira/browse/HBASE-23209
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


The major compaction loop is quite large and has many branches, especially in 
non-MOB mode. Consider moving MOB data only in MOB compaction mode and 
simplifying the non-MOB case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23198) Documentation and release notes

2019-10-21 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23198:
-

 Summary: Documentation and release notes
 Key: HBASE-23198
 URL: https://issues.apache.org/jira/browse/HBASE-23198
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Document all the changes: algorithms, new configuration options, obsolete 
configurations, the upgrade procedure, and the possibility of downgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23190) Convert MobCompactionTest into integration test

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23190:
-

 Summary: Convert MobCompactionTest into integration test
 Key: HBASE-23190
 URL: https://issues.apache.org/jira/browse/HBASE-23190
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23189) Finalize generational compaction

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23189:
-

 Summary: Finalize generational compaction
 Key: HBASE-23189
 URL: https://issues.apache.org/jira/browse/HBASE-23189
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


+corresponding test cases

The current code for generational compaction has not been tested and verified 
yet. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23188) MobFileCleanerChore test case

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23188:
-

 Summary: MobFileCleanerChore test case
 Key: HBASE-23188
 URL: https://issues.apache.org/jira/browse/HBASE-23188
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


The test should do the following:
a) properly remove obsolete files, as expected
b) not remove mob files from before the reference accounting added in 
this change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Limiting gitbox PR email

2019-09-28 Thread Vladimir Rodionov
No, good idea.

-Vlad

On Sat, Sep 28, 2019 at 10:17 AM Nick Dimiduk  wrote:

> Heya,
>
> I would like our dev@ subscription to gitbox notifications to match that
> of
> our JIRA notifications — just open, close of issues. Right now we’re
> getting every comment. Like JIRA, users are able to “watch” individual PRs
> that they find to be of interest.
>
> Any objections if I engage infra on making this change?
>
> Thanks,
> Nick
>


[jira] [Resolved] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-22826.
---
Resolution: Won't Fix

> Wrong FS: recovered.edits goes to wrong file system
> ---
>
> Key: HBASE-22826
> URL: https://issues.apache.org/jira/browse/HBASE-22826
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.5
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> When the WAL is configured on a separate file system, recovered.edits go to 
> the HBase root directory.
> PROBLEM
> * Customer environment
> HBase root directory : On WASB
> hbase.wal.dir : On HDFS
> The customer is creating an HBase table and running VIEW DDL on top of the HBase 
> table. The recovered.edits are going to the HBase root directory in WASB and 
> region assignments are failing.
> The customer is on HBase 2.0.4. 
> {code:java}if (RegionReplicaUtil.isDefaultReplica(getRegionInfo())) {
>   LOG.debug("writing seq id for {}", 
> this.getRegionInfo().getEncodedName());
>   WALSplitter.writeRegionSequenceIdFile(fs.getFileSystem(), 
> getWALRegionDir(), nextSeqId);
>   //WALSplitter.writeRegionSequenceIdFile(getWalFileSystem(), 
> getWALRegionDir(), nextSeqId - 1);{code}
> {code:java}2019-08-05 22:07:31,940 ERROR 
> [RS_OPEN_META-regionserver/c47-node3:16020-0] handler.OpenRegionHandler: 
> Failed open of region=hbase:meta,,1.1588230740
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://c47-node2.squadron-labs.com:8020/hbasewal/hbase/meta/1588230740/recovered.edits,
>  expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:730)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:460)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
> at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
> at 
> org.apache.hadoop.hbase.wal.WALSplitter.getSequenceIdFiles(WALSplitter.java:647)
> at 
> org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:680)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:984)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:881)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7149)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7108)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7080)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7038)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6989)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22826:
-

 Summary: Wrong FS: recovered.edits goes to wrong file system
 Key: HBASE-22826
 URL: https://issues.apache.org/jira/browse/HBASE-22826
 Project: HBase
  Issue Type: New Feature
Affects Versions: 2.0.5
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


When the WAL is configured on a separate file system, recovered.edits go to 
the HBase root directory.
PROBLEM

* Customer environment
HBase root directory : On WASB
hbase.wal.dir : On HDFS

The customer is creating an HBase table and running VIEW DDL on top of the HBase 
table. The recovered.edits are going to the HBase root directory in WASB and region 
assignments are failing.
The customer is on HBase 2.0.4. 


{code:java}
if (RegionReplicaUtil.isDefaultReplica(getRegionInfo())) {
  LOG.debug("writing seq id for {}", this.getRegionInfo().getEncodedName());
  // Passes the root-dir FileSystem although getWALRegionDir() lives on the WAL
  // file system, which leads to the "Wrong FS" exception shown below.
  WALSplitter.writeRegionSequenceIdFile(fs.getFileSystem(), getWALRegionDir(), nextSeqId);
  //WALSplitter.writeRegionSequenceIdFile(getWalFileSystem(), getWALRegionDir(), nextSeqId - 1);
}
{code}


{code:java}2019-08-05 22:07:31,940 ERROR 
[RS_OPEN_META-regionserver/c47-node3:16020-0] handler.OpenRegionHandler: Failed 
open of region=hbase:meta,,1.1588230740
java.lang.IllegalArgumentException: Wrong FS: 
hdfs://c47-node2.squadron-labs.com:8020/hbasewal/hbase/meta/1588230740/recovered.edits,
 expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:730)
at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
at 
org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:460)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at 
org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
at 
org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at 
org.apache.hadoop.hbase.wal.WALSplitter.getSequenceIdFiles(WALSplitter.java:647)
at 
org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:680)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:984)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:881)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7149)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7108)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7080)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7038)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6989)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}





--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HBASE-22749) HBase MOB 2.0

2019-07-26 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22749:
-

 Summary: HBase MOB 2.0
 Key: HBASE-22749
 URL: https://issues.apache.org/jira/browse/HBASE-22749
 Project: HBase
  Issue Type: New Feature
  Components: mob
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


There are several drawbacks in the original MOB 1.0 (Moderate Object Storage) 
implementation, which can limit the adoption of the MOB feature:

# MOB compactions are executed in the Master as a chore, which limits scalability 
because all I/O goes through a single HBase Master server.
# The YARN/MapReduce framework is required to run MOB compactions in a scalable 
way, but this won't work in a stand-alone HBase cluster.
# Two separate compactors for MOB and for regular store files, and their 
interactions, can result in data loss (see HBASE-22075).

The design goal for MOB 2.0 was to provide a 100% MOB 1.0-compatible 
implementation that is free of the above drawbacks and can be used as a drop-in 
replacement in existing MOB deployments. The design goals of MOB 2.0 are:

# Make MOB compactions scalable without relying on the YARN/MapReduce framework.
# Provide a unified compactor for both MOB and regular store files.
# Make it more robust, especially w.r.t. data loss.
# Simplify and reduce the overall MOB code.
# Provide a 100% compatible implementation with MOB 1.0.
# Require no data migration between MOB 1.0 and MOB 2.0 - just a software upgrade.




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Can we remove OfflineMetaRepair in the master branch and branch-2+?

2019-07-11 Thread Vladimir Rodionov
+1. It has always been an obscure tool.

On Wed, Jul 10, 2019 at 8:47 PM Toshihiro Suzuki 
wrote:

> Hi folks!
>
> I think we no longer support OfflineMetaRepair in HBase-2.x and it has a
> critical bug that breaks the meta data:
> https://issues.apache.org/jira/browse/HBASE-21665
>
> Actually, I have seen several cases where some users ran OfflineMetaRepair
> mistakenly in their cluster and corrupted their meta data.
>
> Also, new meta rebuilding tool is being developed by Wellington at the
> moment in HBCK2:
> https://issues.apache.org/jira/browse/HBASE-22567
>
> So I think we can remove OfflineMetaRepair in the master branch and
> branch-2+. What do you think?
>
> If no objections, I'll file this in Apache Jira.
>
> Regards,
> Toshi
>


[jira] [Resolved] (HBASE-22205) Backport HBASE-21688 to 1.3+ branches

2019-04-10 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-22205.
---
Resolution: Not A Problem

> Backport HBASE-21688 to 1.3+ branches
> -
>
> Key: HBASE-22205
> URL: https://issues.apache.org/jira/browse/HBASE-22205
> Project: HBase
>  Issue Type: Bug
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>
> We need to address WAL file system issues in 1.x as well (those branches, 
> which support separate file system for WAL - 1.3+)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22205) Backport HBASE-21688 to 1.3+ branches

2019-04-10 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22205:
-

 Summary: Backport HBASE-21688 to 1.3+ branches
 Key: HBASE-22205
 URL: https://issues.apache.org/jira/browse/HBASE-22205
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


We need to address WAL file system issues in 1.x as well (those branches, which 
support separate file system for WAL - 1.3+)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: GetAndPut

2019-03-25 Thread Vladimir Rodionov
Interesting. If CheckAndPut succeeds, then you know the value and there is no need
for a Get, right?
You only want to know the current value if CheckAndPut fails?
Can you elaborate on your use case, Jean-Marc?

-Vlad

On Mon, Mar 25, 2019 at 11:54 AM Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi all,
>
> We have all CheckAndxxx operations, where we verify something and if the
> condition is true we perform the operatoin (Put, Delete, Mutation, etc.).
>
> I'm looking for a GetAndPut operation. Where in a single atomic call, I can
> get the actual value of a cell (if any), and perform the put. Working on a
> usecase where this might help.
>
> Do we have anything like that? I can simulate by doing a Get then a
> CheckAndPut, but that's 2 calls. Trying to save one call ;)
>
> Do we have anything like that?
>
> Thanks
>
> JMS
>


[jira] [Created] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22075:
-

 Summary: Potential data loss when MOB compaction fails
 Key: HBASE-22075
 URL: https://issues.apache.org/jira/browse/HBASE-22075
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


When MOB compaction fails during the last step (bulk load of a newly created 
reference file), there is a high chance of data loss due to a partially loaded 
reference file whose cells refer to a (now) non-existent MOB file. The newly 
created MOB file is deleted automatically in case of a MOB compaction failure, 
but some cells with references to this file might already have been loaded into HBase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-21936) Disable split/merge of a table during snapshot

2019-02-20 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-21936.
---
Resolution: Invalid

> Disable split/merge of a table during snapshot
> --
>
> Key: HBASE-21936
> URL: https://issues.apache.org/jira/browse/HBASE-21936
> Project: HBase
>  Issue Type: Bug
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.3.0
>
> Attachments: HBASE-21936-master-v1.patch
>
>
> https://issues.apache.org/jira/browse/HBASE-17942 has introduced per-table 
> split/merge enablement. This new feature should be used during a table's 
> snapshot to avoid failures due to concurrent splits/merges.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21936) Disable split/merge of a table before taking snapshot

2019-02-19 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-21936:
-

 Summary: Disable split/merge of a table before taking snapshot
 Key: HBASE-21936
 URL: https://issues.apache.org/jira/browse/HBASE-21936
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


https://issues.apache.org/jira/browse/HBASE-17942 has introduced per-table 
split/merge enablement. This new feature should be used during a table's snapshot 
to avoid failing snapshots due to concurrent splits/merges.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21688) Address WAL filesystem issues

2019-01-22 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-21688:
---

Opened for amendment.

> Address WAL filesystem issues
> -
>
> Key: HBASE-21688
> URL: https://issues.apache.org/jira/browse/HBASE-21688
> Project: HBase
>  Issue Type: Bug
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21688-v1.patch
>
>
> Scan and fix the code base to use the new way of instantiating the WAL FileSystem. 
> https://issues.apache.org/jira/browse/HBASE-21457?focusedCommentId=16734688=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16734688



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-21457) BackupUtils#getWALFilesOlderThan refers to wrong FileSystem

2019-01-07 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-21457.
---
Resolution: Fixed

> BackupUtils#getWALFilesOlderThan refers to wrong FileSystem
> ---
>
> Key: HBASE-21457
> URL: https://issues.apache.org/jira/browse/HBASE-21457
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Janos Gub
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21457.v1.txt, 21457.v2.txt, 21457.v3.txt, 21457.v3.txt, 
> 21457.v4.txt, HBASE-21457.add.patch
>
>
> Janos reported seeing backup test failure when testing a local HDFS for WALs 
> while using WASB/ADLS only for store files.
> Janos spotted the code in BackupUtils#getWALFilesOlderThan which uses HBase 
> root dir for retrieving WAL files.
> We should use the helper methods from CommonFSUtils.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21688) Address WAL filesystem issues

2019-01-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-21688:
-

 Summary: Address WAL filesystem issues
 Key: HBASE-21688
 URL: https://issues.apache.org/jira/browse/HBASE-21688
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-21457) BackupUtils#getWALFilesOlderThan refers to wrong FileSystem

2019-01-04 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-21457:
---

Opened for addendum.

> BackupUtils#getWALFilesOlderThan refers to wrong FileSystem
> ---
>
> Key: HBASE-21457
> URL: https://issues.apache.org/jira/browse/HBASE-21457
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Janos Gub
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 21457.v1.txt, 21457.v2.txt, 21457.v3.txt, 21457.v3.txt, 
> 21457.v4.txt
>
>
> Janos reported seeing backup test failure when testing a local HDFS for WALs 
> while using WASB/ADLS only for store files.
> Janos spotted the code in BackupUtils#getWALFilesOlderThan which uses HBase 
> root dir for retrieving WAL files.
> We should use the helper methods from CommonFSUtils.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: if changing the RingBuffer to priority Based Queue cause correctness issue in HBase?

2018-11-08 Thread Vladimir Rodionov
This should be handled in the RPC queues first, before a mutation op reaches
the RingBuffer, I think.
But to answer your question: the only guarantees HBase provides (promises) are
strictly consistent writes and atomicity for
a single-row mutation. No, it won't break anything. Permutations of mutations
in an execution pipeline are
safe, because they can't violate HBase's promises.

-Vlad

On Wed, Nov 7, 2018 at 1:49 PM Jing Liu  wrote:

> Hi,
>
> I'am trying to add priority to schedule different types of requests in
> HBase. But the Write-ahead logging use RingBuffer which
> is essentially a FIFO queue makes it hard. In this case, let's say if the
> low priority request already queued in the RingBuffer, the high priority
> request can not be executed before all those queued low priority request.
> I'm wondering if I change the FIFO queue into Priority-based queue
> will violate the write consistency guarantee or other issues?
>
> Thanks,
> Jing
>


[jira] [Created] (HBASE-21219) Hbase incremental backup fails with null pointer exception

2018-09-21 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-21219:
-

 Summary: Hbase incremental backup fails with null pointer exception
 Key: HBASE-21219
 URL: https://issues.apache.org/jira/browse/HBASE-21219
 Project: HBase
  Issue Type: Bug
  Components: backuprestore
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 3.0.0


hbase backup create incremental hdfs:///bkpHbase_Test/bkpHbase_Test2 -t 
bkpHbase_Test2 
2018-09-21 15:35:31,421 INFO [main] impl.TableBackupClient: Backup 
backup_1537524313995 started at 1537524331419. 2018-09-21 15:35:31,454 INFO 
[main] impl.IncrementalBackupManager: Execute roll log procedure for 
incremental backup ... 2018-09-21 15:35:32,985 ERROR [main] 
impl.TableBackupClient: Unexpected Exception : java.lang.NullPointerException 
java.lang.NullPointerException at 
org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getLogFilesForNewBackup(IncrementalBackupManager.java:309)
 at 
org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getIncrBackupLogFileMap(IncrementalBackupManager.java:103)
 at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:276)
 at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:601)
 at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:347)
 at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:138) 
at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:171) at 
org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:204) at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at 
org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:179) 
2018-09-21 15:35:32,989 ERROR [main] impl.TableBackupClient: 
BackupId=backup_1537524313995,startts=1537524331419,failedts=1537524332989,failedphase=PREPARE_INCREMENTAL,failedmessage=null
 2018-09-21 15:35:57,167 ERROR [main] impl.TableBackupClient: Backup 
backup_1537524313995 failed. 

Backup session finished. Status: FAILURE 2018-09-21 15:35:57,175 ERROR [main] 
backup.BackupDriver: Error running 

command-line tool java.io.IOException: java.lang.NullPointerException at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:281)
 at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:601)
 at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:347)
 at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:138) 
at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:171) at 
org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:204) at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at 
org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:179) Caused 
by: java.lang.NullPointerException at 
org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getLogFilesForNewBackup(IncrementalBackupManager.java:309)
 at 
org.apache.hadoop.hbase.backup.impl.IncrementalBackupManager.getIncrBackupLogFileMap(IncrementalBackupManager.java:103)
 at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:276)
 ... 7 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Rough notes from dev meetup, day after hbaseconasia 2018, saturday morning

2018-08-20 Thread Vladimir Rodionov
C* fares much better with its (very limited) CQL than HBase does with its
advanced Phoenix.
Just saying.

My 2c

-Vlad



On Mon, Aug 20, 2018 at 12:06 PM, Andrew Purtell 
wrote:

> It would be helpful if someone could forward the relevant bits of Phoenix
> discussion to the Phoenix dev list. One thing I know that project lacks is
> usability feedback. I don't see anyone writing in with suggestions, mainly
> complaining about it on a HBase list somewhere. Could just be I lack
> perspective and those conversations are happening somewhere, but I am a
> subscriber to all of the relevant lists and this is my observation. If a
> correct observation, this is not really fair. I work somewhere that has
> Phoenix in production. There is no doubt the attempt to implement RDBMS
> functionality *inside* HBase as an add on component is a challenging
> undertaking. However, any would be substitute I have seen to date either
> doesn't actually attempt the same challenges, or takes a shortcut which
> renders any comparison to the proverbial "apples and oranges". The tell
> here is the notion of *lightweight* SQL access. Reads as a tremendous
> limitation of scope. SQL is a huge standard incorporating 30+ years of
> development in relational systems capabilities and semantics. We will get
> into trouble if we ever attempt a "lightweight" SQL interface to HBase that
> fails to match expectations which automatically attach to the effort
> whenever you claim it to be a SQL interface. This is a cross the Phoenix
> project already bears. If SQL support is really the goal it would be better
> to assist there. Or, if the goal is the barest minimal SQL-like thing
> someone needs to support their use case, and then contribute to HBase, call
> it something else, like Cassandra did with CQL. Would be like the other
> connectors - thrift, REST, Kafka, etc. - and should go into the connectors
> repo, in my opinion.
>
>
> On Sun, Aug 19, 2018 at 3:50 AM Stack  wrote:
>
> >
> > Next we went over backburner items mention on previous day staring with
> > SQL-like access.
> > What about lightweight SQL support?
> > At Huawei... they have a project going for lightweight SQL support in
> hbase
> > based-on calcite.
> > For big queries, they'd go to sparksql.
> > Did you look at phoenix?
> > Phoenix is complicated, difficult. Calcite migration not done in Phoenix
> > (Sparksql is not calcite-based).
> > Talk to phoenix project about generating a lightweight artifact. We could
> > help with build. One nice idea was building with a cut-down grammar, one
> > that removed all the "big stuff" and problematics. Could return to the
> user
> > a nice "not supported" if they try to do a 10Bx10B join.
> > An interesting idea about a facade query analyzer making transfer to
> > sparksql if big query. Would need stats.
>


[jira] [Created] (HBASE-21077) MR job launched by hbase incremental backup command failed with FileNotFoundException

2018-08-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-21077:
-

 Summary: MR job launched by hbase incremental backup command 
failed with FileNotFoundException
 Key: HBASE-21077
 URL: https://issues.apache.org/jira/browse/HBASE-21077
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20729) B BackupLogCleaner must ignore ProcV2 WAL files

2018-06-13 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-20729:
-

 Summary: B BackupLogCleaner must ignore ProcV2 WAL files
 Key: HBASE-20729
 URL: https://issues.apache.org/jira/browse/HBASE-20729
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


These are WAL files B does need for backup. The issue does not affect B 
functionality though. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20631) B: Merge command enhancements

2018-05-23 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-20631:
-

 Summary: B: Merge command enhancements 
 Key: HBASE-20631
 URL: https://issues.apache.org/jira/browse/HBASE-20631
 Project: HBase
  Issue Type: New Feature
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Currently, merge supports only a list of backup ids, which users must provide. 
Date-range merges would be more convenient for users. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20630) B: Delete command enhancements

2018-05-23 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-20630:
-

 Summary: B: Delete command enhancements
 Key: HBASE-20630
 URL: https://issues.apache.org/jira/browse/HBASE-20630
 Project: HBase
  Issue Type: New Feature
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Make the command more usable. Currently, the user needs to provide a list of 
backup ids to delete. It would be nice to have more convenient options, such as 
deleting all backups older than XXX days, etc. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20547) Restore from backup will fail if done from a different file system

2018-05-08 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-20547:
-

 Summary: Restore from backup will fail if done from a different 
file system
 Key: HBASE-20547
 URL: https://issues.apache.org/jira/browse/HBASE-20547
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-15227) HBase Backup Phase 3: Fault tolerance (client/server) support

2018-04-03 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-15227.
---
Resolution: Fixed

Done.

> HBase Backup Phase 3: Fault tolerance (client/server) support
> -
>
> Key: HBASE-15227
> URL: https://issues.apache.org/jira/browse/HBASE-15227
> Project: HBase
>  Issue Type: Task
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Major
>  Labels: backup
> Attachments: HBASE-15227-v3.patch, HBASE-15277-v1.patch
>
>
> The system must be tolerant to faults: 
> # Backup operations MUST be atomic (no partial-completion state in the backup 
> system table)
> # The process must detect any type of failure which can result in data loss 
> (partial backup or partial restore) 
> # Proper system table state restore and cleanup must be done in case of a 
> failure
> # An additional utility to repair the backup system table and perform the 
> corresponding file system cleanup must be implemented
> h3. Backup
> h4. General FT framework implementation 
> Before the actual backup operation starts, a snapshot of the backup system 
> table is taken and the system table is updated with the *ACTIVE_SNAPSHOT* 
> flag. The flag is removed upon backup completion. 
> In case of *any* server-side failure, the client catches errors/exceptions and 
> handles them:
> # Cleans up the backup destination (removes partial backup data)
> # Cleans up any temporary data
> # Deletes any active snapshots of tables being backed up (during full 
> backup we snapshot tables)
> # Restores the backup system table from its snapshot
> # Deletes the backup system table snapshot (we read the snapshot name from the 
> backup system table beforehand)
> In case of *any* client-side failure:
> Before any backup or restore operation runs, we check the backup system table 
> for *ACTIVE_SNAPSHOT*; if the flag is present, the operation aborts with a 
> message that the backup repair tool (see below) must be run
> h4. Backup repair tool
> The command line tool *backup repair* executes the following steps:
> # Reads the info of the last failed backup session
> # Cleans up the backup destination (removes partial backup data)
> # Cleans up any temporary data
> # Deletes any active snapshots of tables being backed up (during full 
> backup we snapshot tables)
> # Restores the backup system table from its snapshot
> # Deletes the backup system table snapshot (we read the snapshot name from the 
> backup system table beforehand)
> h4. Detection of a partial loss of data
> h5. Full backup  
> Export snapshot operation (?).
> We count files and check sizes before and after the DistCp run.
> h5. Incremental backup 
> Conversion of WAL files to HFiles: a WAL file may be moved from the active to 
> the archive directory during conversion; the code is in place to handle this 
> situation. During the DistCp run, the same checks as above apply.
> h3. Restore
> This operation does not modify the backup system table and is idempotent. No 
> special FT is required.
>  
>  
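For illustration, a minimal Java sketch of the snapshot-guarded session described
above. All class and method names here are hypothetical stand-ins, not the actual
B&R code; only the take-snapshot / run / restore-on-failure shape is the point:

import java.io.IOException;

// Hypothetical seams standing in for the real backup system table and the
// actual backup work; only the control flow mirrors the description above.
interface BackupSystemTable {
  void snapshot() throws IOException;               // snapshot the backup system table
  void setActiveSnapshotFlag() throws IOException;  // mark ACTIVE_SNAPSHOT
  void clearActiveSnapshotFlag() throws IOException;
  void restoreFromSnapshot() throws IOException;    // roll system table state back
  void deleteSnapshot() throws IOException;
}

interface BackupWork {
  void run() throws IOException;     // the actual full/incremental backup
  void cleanupPartialData();         // remove partial data and temporary files
}

class SnapshotGuardedBackupSession {
  static void execute(BackupSystemTable table, BackupWork work) throws IOException {
    table.snapshot();
    table.setActiveSnapshotFlag();
    try {
      work.run();
      table.clearActiveSnapshotFlag();  // success: the session completed atomically
      table.deleteSnapshot();
    } catch (IOException e) {
      // Server-side failure: clean the destination, restore the system table,
      // and drop the snapshot -- the same steps the repair tool performs when
      // the client itself died and left ACTIVE_SNAPSHOT behind.
      work.cleanupPartialData();
      table.restoreFromSnapshot();
      table.deleteSnapshot();
      throw e;
    }
  }
}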



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19969) Improve FT in merge operation

2018-02-09 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-19969:
-

 Summary: Improve FT in merge operation
 Key: HBASE-19969
 URL: https://issues.apache.org/jira/browse/HBASE-19969
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19568) Restore of HBase table using incremental backup doesn't restore rows from an earlier incremental backup

2017-12-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-19568:
-

 Summary: Restore of HBase table using incremental backup doesn't 
restore rows from an earlier incremental backup
 Key: HBASE-19568
 URL: https://issues.apache.org/jira/browse/HBASE-19568
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0

2017-12-01 Thread Vladimir Rodionov
Thanks, Mike

#1 is done
#4 is done, but not committed yet
#3 is questionable, to say the least. All B&R tools provide a guarantee of
operation correctness if the operation succeeds. Otherwise, what is the point
of the separate fault-tolerance work? FT includes a correctness guarantee as
well: we track all the needed WAL files and bulk-loaded files during
incremental backup and guarantee that every single file will be converted and
moved to the backup destination. If you need an additional guarantee, restore
the backup into a separate table and do the verification yourself.

#2 is ongoing

On Fri, Dec 1, 2017 at 10:30 AM, Mike Drob <md...@apache.org> wrote:

> The list is what Josh proposed in the original email to the list.
>
> What is the JIRA for #3?
>
> On Fri, Dec 1, 2017 at 12:20 PM, Vladimir Rodionov <vladrodio...@gmail.com
> >
> wrote:
>
> > Where did you get this from, Stack?
> >
> > I am doing scale testing now and this is last task on *my* list for
> beta-1.
> >
> > On Thu, Nov 30, 2017 at 10:27 PM, Stack <st...@duboce.net> wrote:
> >
> > > On Tue, Nov 7, 2017 at 8:30 PM, Josh Elser <els...@apache.org> wrote:
> > >
> > > > Folks,
> > > >
> > > > I've been working with Vlad and Ted offline to make sure we have a
> plan
> > > > that addresses the implementation gaps Vlad sees and the
> > > barriers-for-entry
> > > > previously stated to keep the feature in HBase 2.0. My hope is that
> > this
> > > > can be an honest discussion given 2.0-beta timelines, with a concrete
> > > > action plan. I'm trying my best to not re-hash the
> > > logic/reasoning/caveats
> > > > behind previous concerns; anything folks feel is a blocker that I
> > haven't
> > > > covered below is unintentional.
> > > >
> > > > The list:
> > > >
> > > > 1. Documentation. It must be updated and committed, ensuring it
> covers
> > > the
> > > > details operators/architects need to know to use it effectively
> > > > (HBASE-16574). Vlad will help with content, myself and/or Frank will
> > get
> > > it
> > > > updated to asciidoc.
> > > >
> > > > 2. Distributed testing missing. Vlad has taken my previous document
> on
> > > > goals and translated that into an implementation outline[1]. Ted and
> I
> > > have
> > > > already weighed in -- I believe it hits the salient points for the
> > > quality
> > > > of testing we're looking for. I'll get started on this while Vlad
> does
> > #4
> > > > (after consensus on approach, of course). Needs JIRA issue (maybe?).
> > > >
> > > > 3. Operator utility to verify backups. In abstract, this should just
> be
> > > > the same guts of a tool like VerifyReplication. In practice, this
> > should
> > > be
> > > > the same code that #3 uses (if not _actually_ the same guts as
> > > > VerifyReplication). The hope is that this will be encapsulated
> > > (time-wise)
> > > > by #3. Needs JIRA issue (maybe?).
> > > >
> > > > 4. Polish DistCP for bulk-loaded files/fault-tolerance
> (HBASE-17852). I
> > > > don't have specifics here -- will rely on Vlad to correct me if
> > there's a
> > > > better JIRA issue to track than the aforementioned. Will rely on
> > details
> > > to
> > > > show up the JIRA issue to track it.
> > > >
> > > > Current due dates:
> > > >
> > > >
> > > Checking in on the plan.
> > >
> > >
> > > > 1. End of week (2017/11/10)
> > > >
> > >
> > > I believe this is done.
> > >
> > >
> > > > 2. Before US Thanksgiving (2017/11/22)
> > > > 3. Same as #2
> > > > 4. Same as #1
> > > >
> > > >
> > > These were not done in time for thanksgiving? Correct me if I'm wrong.
> > >
> > > Thanks,
> > > St.Ack
> > >
> > >
> > >
> > > > My current thought is that this is reasonable for implementation
> times,
> > > > and would not derail the rest of the beta-1 train. I appreciate the
> > > > patience from all parties, and I hope that those trying to make this
> > > better
> > > > can find a little more time to give some feedback. Thanks for the
> long
> > > read
> > > > if nothing else.
> > > >
> > > > - Josh
> > > >
> > > > [1] https://docs.google.com/document/d/1xbPlLKjOcPq2LDqjbSkF6uND
> > > > AG0mzgOxek6P3POLeMc/edit?usp=sharing
> > > >
> > >
> >
>


Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0

2017-11-30 Thread Vladimir Rodionov
Nope, Mike. Fortunately, 99% of the FT code will remain after introducing
concurrent-session support.

Just two lines will be changed: TakeSnapshot -> BeginTX, RestoreSnapshot ->
RollbackTx
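
For illustration only (hypothetical names, not the actual code), the FT guard can
sit behind a tiny seam, so swapping the snapshot-based rollback for a
transaction-based one touches just those two calls:

import java.io.IOException;

// Hypothetical seam: the rest of the FT framework only ever calls begin/rollback.
interface FtGuard {
  void begin() throws IOException;     // today: TakeSnapshot of the system table
  void rollback() throws IOException;  // today: RestoreSnapshot
}

class SnapshotGuard implements FtGuard {          // current, single-session behavior
  public void begin() throws IOException { /* TakeSnapshot */ }
  public void rollback() throws IOException { /* RestoreSnapshot */ }
}

class TxGuard implements FtGuard {                // future, concurrent-session behavior
  public void begin() throws IOException { /* BeginTX */ }
  public void rollback() throws IOException { /* RollbackTx */ }
}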

-Vlad

On Thu, Nov 30, 2017 at 7:20 PM, Mike Drob  wrote:

> Bringing this thread up again, because I don't really know where else to
> ask...
>
> The current backup solution snapshots the backup metadata table and
> will restore-via-snapshot in case something goes wrong (or this is still in
> a patch? unclear if this has been committed or not, since there's a ton of
> code to dig through)
>
> AFAICT this is the major reason that we do not support concurrent backup or
> restore operations. (Are there others? Also couldn't find this.)
>
> The fault tolerance that we're working on now will need to be gutted and
> completely rewritten for the future improvements. I get that this is all
> internal and as long as we make it seamless for the operators then we have
> wide latitude to make our own changes. But an important question is just
> because we can, does it mean we should do this? I'm concerned that we're
> writing code that we know will get thrown away and replaced, except we will
> have to continue to support it for as long as 2.0 is an active branch.
>
> Mike
>
>
> On Wed, Nov 15, 2017 at 3:05 PM, Josh Elser  wrote:
>
> > On 11/14/17 4:54 PM, Mike Drob wrote:
> >
> >> I can see a small section on the documentation update I've already been
> >>> hacking on to include details on the issue "We can't help you secure
> >>> where
> >>> you put the data". Given how many instances of "globally readable S3
> >>> bucket" I've seen recently, this strikes me as prudent.
> >>>
> >>> I would prefer this to be a giant, hard to miss, red letters, all caps
> >> warning; not a small section. I do think it is our responsibility for
> >> telling users how to configure the backup/restore process for
> >> communicating
> >> with secure systems. Or, at a minimum, documenting how we pass arbitrary
> >> configuration options that can then be used to communicate with said
> >> systems.
> >>
> >
> > :D
> >
> > For example, if we support writing backups to S3, then we should have a
> way
> >> to specify an Auth string and maybe even some of the custom headers like
> >> x-amz-acl. We don't have to explicitly enumerate best practices, but if
> >> the
> >> only option is to write to a globally open bucket, then I don't think we
> >> should advertise writing to S3 as an available option.
> >>
> >> Similarly, if we tell people that they can send backups to HDFS, then we
> >> should give them the hooks to correctly interface with a kerberized
> HDFS.
> >>
> >> Maybe this is already in the proposed patch, I haven't gone looking yet.
> >>
> >
> > Nope. I actually meant to include this in the patch I re-rolled today but
> > forgot. Let me update once more.
> >
> > Thanks again, Mike. Good questions/feedback!
> >
>


[jira] [Reopened] (HBASE-16391) Multiple backup/restore sessions support

2017-11-29 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-16391:
---
  Assignee: Vladimir Rodionov

> Multiple backup/restore sessions support
> 
>
> Key: HBASE-16391
> URL: https://issues.apache.org/jira/browse/HBASE-16391
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: HBASE-7912
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
> Fix For: 2.1.0
>
>
> Multiple simultaneous sessions support for backup/restore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Performance degradation in master (compared to 2-alpha-1)

2017-11-28 Thread Vladimir Rodionov
>> CompletableFuture more and more in our code base, but at least before
>>jdk8u131 there is a performance regression for CompletableFuture.

The performance regression is 2x, so this should be something else.

On Tue, Nov 28, 2017 at 6:49 PM, 宾莉金(binlijin)  wrote:

> HBASE-19338 has committed, do you want to update the master branch and test
> it again?
>
> 2017-11-29 10:32 GMT+08:00 张铎(Duo Zhang) :
>
> > And one thing may effect performance is that, now we rely on
> > CompletableFuture more and more in our code base, but at least before
> > jdk8u131 there is a performance regression for CompletableFuture. So
> > consider moving to the newest jdk if you are still on an older version.
> >
> > Thanks.
> >
> > 2017-11-29 3:35 GMT+08:00 Mike Drob :
> >
> > > Ted,
> > >
> > > To clarify, I'm talking about a hypothetical bugfix that impacted
> > > performance, not to imply that I know of a specific such change.
> > >
> > > I've seen it often enough before that performance is blazing fast at
> the
> > > expense of accuracy and people are surprised when correctness takes
> > longer.
> > >
> > >
> > > Mike
> > >
> > > On Tue, Nov 28, 2017 at 1:01 PM, Ted Yu  wrote:
> > >
> > > > Mike:
> > > > Which JIRA was the important bug-fix ?
> > > >
> > > > Thanks
> > > >
> > > > On Tue, Nov 28, 2017 at 9:16 AM, Mike Drob  wrote:
> > > >
> > > > > Eshcar - do you have time to try the other alpha releases and see
> > where
> > > > > exactly we introduced the regressions?
> > > > >
> > > > > Also, I'm worried that the performance regression may be related to
> > an
> > > > > important bug-fix, where before we may have had fast writes but
> also
> > > > risked
> > > > > incorrect behavior somehow.
> > > > >
> > > > > Mike
> > > > >
> > > > > On Tue, Nov 28, 2017 at 2:48 AM, Eshcar Hillel
> >  > > >
> > > > > wrote:
> > > > >
> > > > > > I agree, so will wait till we focus on performance.
> > > > > > Just one more update: I also ran the same experiment (write-only)
> > > > > > with branch-2 beta-1. Here is a summary of the throughput I see in
> > > > > > each tag/branch:
> > > > > > |           | BASIC | NONE |
> > > > > > | 2-alpha-1 | 110K  | 80K  |
> > > > > > | 2-beta-1  |  81K  | 62K  |
> > > > > > | master    |  60K  | 55K  |
> > > > > > This means there are multiple sources for the regression.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Saturday, November 25, 2017, 7:44:01 AM GMT+2, 张铎(Duo
> > Zhang) <
> > > > > > palomino...@gmail.com> wrote:
> > > > > >
> > > > > >  I think first we need a release plan on when we will begin to
> > focus
> > > on
> > > > > the
> > > > > > performance issue?
> > > > > >
> > > > > > I do not think it is a good time to focus on performance issue
> now
> > as
> > > > we
> > > > > > haven’t stabilized our build yet. The performance regression may
> > come
> > > > > back
> > > > > > again after some bug fixes and maybe we use a wrong way to
> increase
> > > > > > performance and finally we find that it is just a bug...
> > > > > >
> > > > > > Of course I do not mean we can not do any performance related
> > issues
> > > > now,
> > > > > > for example, HBASE-19338 is a good catch and can be fixed right
> > now.
> > > > > >
> > > > > > And also, for AsyncFSWAL and in memory compaction, we need to
> > > consider
> > > > > the
> > > > > > performance right now as they are born for performance, but let’s
> > > focus
> > > > > on
> > > > > > the comparison to other policies, not a previous release so we
> can
> > > find
> > > > > the
> > > > > > correct things to fix.
> > > > > >
> > > > > > Of course, if there is a big performance downgrading comparing to
> > the
> > > > > > previous release and we find it then we should tell others, just
> > like
> > > > > this
> > > > > > email. An earlier notification is always welcomed.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Stack 于2017年11月25日 周六13:22写道:
> > > > > >
> > > > > > > On Thu, Nov 23, 2017 at 7:35 AM, Eshcar Hillel
> > > >  > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Happy Thanksgiving all,
> > > > > > > >
> > > > > > >
> > > > > > > And to you Eshcar.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > In recent benchmarks I ran in HBASE-18294 I discovered major
> > > > > > performance
> > > > > > > > degradation of master code w.r.t 2-alpha-1 code.I am running
> > > > > write-only
> > > > > > > > workload (similar to the one reported in HBASE-16417). I am
> > using
> > > > the
> > > > > > > same
> > > > > > > > hardware and same configuration settings (specifically, I
> > testes
> > > > both
> > > > > > > basic
> > > > > > > > memstore compaction with optimal parameters, and no memsore
> > > > > > > > compaction).While in 

Re: [DISCUSS] Performance degradation in master (compared to 2-alpha-1)

2017-11-28 Thread Vladimir Rodionov
I do not know if this is related, but the test suite for backup now runs 57 min
on master. It used to run in under 34 minutes.

-Vlad

On Tue, Nov 28, 2017 at 9:16 AM, Mike Drob  wrote:

> Eshcar - do you have time to try the other alpha releases and see where
> exactly we introduced the regressions?
>
> Also, I'm worried that the performance regression may be related to an
> important bug-fix, where before we may have had fast writes but also risked
> incorrect behavior somehow.
>
> Mike
>
> On Tue, Nov 28, 2017 at 2:48 AM, Eshcar Hillel 
> wrote:
>
> > I agree, so will wait till we focus on performance.
> > Just one more update: I also ran the same experiment (write-only) with
> > branch-2 beta-1. Here is a summary of the throughput I see in each
> > tag/branch:
> > |           | BASIC | NONE |
> > | 2-alpha-1 | 110K  | 80K  |
> > | 2-beta-1  |  81K  | 62K  |
> > | master    |  60K  | 55K  |
> > This means there are multiple sources for the regression.
> >
> > Thanks
> >
> > On Saturday, November 25, 2017, 7:44:01 AM GMT+2, 张铎(Duo Zhang) <
> > palomino...@gmail.com> wrote:
> >
> >  I think first we need a release plan on when we will begin to focus on
> the
> > performance issue?
> >
> > I do not think it is a good time to focus on performance issue now as we
> > haven’t stabilized our build yet. The performance regression may come
> back
> > again after some bug fixes and maybe we use a wrong way to increase
> > performance and finally we find that it is just a bug...
> >
> > Of course I do not mean we can not do any performance related issues now,
> > for example, HBASE-19338 is a good catch and can be fixed right now.
> >
> > And also, for AsyncFSWAL and in memory compaction, we need to consider
> the
> > performance right now as they are born for performance, but let’s focus
> on
> > the comparison to other policies, not a previous release so we can find
> the
> > correct things to fix.
> >
> > Of course, if there is a big performance downgrading comparing to the
> > previous release and we find it then we should tell others, just like
> this
> > email. An earlier notification is always welcomed.
> >
> > Thanks.
> >
> > Stack 于2017年11月25日 周六13:22写道:
> >
> > > On Thu, Nov 23, 2017 at 7:35 AM, Eshcar Hillel  >
> > > wrote:
> > >
> > > > Happy Thanksgiving all,
> > > >
> > >
> > > And to you Eshcar.
> > >
> > >
> > >
> > > > In recent benchmarks I ran in HBASE-18294 I discovered major
> > performance
> > > > degradation of master code w.r.t 2-alpha-1 code.I am running
> write-only
> > > > workload (similar to the one reported in HBASE-16417). I am using the
> > > same
> > > > hardware and same configuration settings (specifically, I testes both
> > > basic
> > > > memstore compaction with optimal parameters, and no memsore
> > > > compaction).While in 2-alpha-1 code I see throughput of ~110Kops for
> > > basic
> > > > compaction and ~80Kops for no compaction, in the master code I get
> only
> > > > 60Kops and 55Kops, respectively. *This is almost 50% reduction in
> > > > performance*.
> > > > (1) Did anyone else noticed such degradation?(2) Do we have any
> > > systematic
> > > > automatic/semi-automatic method to track the sources of this
> > performance
> > > > issue?
> > > > Thanks,Eshcar
> > > >
> > >
> > >
> > > On #1, no. I've not done perf compare. I wonder if later alpha versions
> > > include the regression (I'll have to check and see).
> > >
> > > On #2, again no. I intend to do a bit of perf tuning and compare before
> > > release.
> > >
> > > If you don't file an issue, I will do so later for myself as a task to
> > > compare at least to alpha-1.
> > >
> > > Thanks Eshcar,
> > >
> > > St.Ack
> > >
> >
>


Re: Any Eclipse users among devs?

2017-11-16 Thread Vladimir Rodionov
Sure

I will try, Josh

-Vlad

On Thu, Nov 16, 2017 at 10:57 AM, Josh Elser  wrote:

> I still use Eclipse for most of my development and have been fighting it
> being broken for branch-2 (ironically, as a result of some "fixes" I made
> previously).
>
> Anyone with a setup who can test a clean import as a part of
> https://issues.apache.org/jira/browse/HBASE-19267 would be greatly
> appreciated.
>
> Thanks!
>
> - Josh
>


Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0

2017-11-13 Thread Vladimir Rodionov
Thanks, Mike

We will take a look.

-Vlad

On Mon, Nov 13, 2017 at 11:45 AM, Mike Drob  wrote:

> Sure, I don't think there are any issue with sharing this publicly, since
> the code has only gone out in alpha releases.
>
> The suspect lines in IncrementalTableBackupClient are 163 and 326. I'm
> still working on validating the call path that leads to those getting
> flagged.
>
> The issues in MapReduceBackupCopyJob are on lines 386, 405, and 407.
>
> All of them relate to un-sanitized inputs in one way or another.
>
> On Mon, Nov 13, 2017 at 12:50 PM, Ted Yu  wrote:
>
> > Mike:
> > Can you share your finding w.r.t. IncrementalTableBackupClient and
> > MapReduceBackupCopyJob
> > ?
> >
> > IncrementalTableBackupClient utilizes WALPlayer directly.
> >
> > I wonder what vulnerability there is.
> >
> > Thanks
> >
> > On Mon, Nov 13, 2017 at 9:02 AM, Mike Drob  wrote:
> >
> > > I know I'm late to the party here, but I've got another potential
> blocker
> > > to add.
> > >
> > > We just ran an HP fortify scan internally and the results did not look
> > > good, specifically on IncrementalTableBackupClient and
> > > MapReduceBackupCopyJob. I'm still sorting through whether these are
> > > actually exploitable, or whether it's a symptom of MapReduce being an
> > > arbitrary code execution framework anyway but this does make me wonder
> > > about the overall security posture.
> > >
> > > I see  "HBase Backup/Restore Phase 3: Security"[1] resolved as "Later"
> > and
> > > claims that it will be implemented in the client, both of which make me
> > > uncomfortable. Security Later is a general bad practice, and it is very
> > > rarely correct to rely on client-side security for anything.
> > >
> > > Is there another issue that covers security? Do we rely completely on
> > HDFS
> > > security here for more than just the DistCP? What kind of testing has
> > been
> > > done with security, do we have assurances that the backups aren't
> > > accidentally exposing tables to the world?
> > >
> > > Thanks,
> > > Mike
> > >
> > > [1]: https://issues.apache.org/jira/browse/HBASE-14138
> > >
> > > On Mon, Nov 13, 2017 at 10:38 AM, Josh Elser 
> wrote:
> > >
> > > > On 11/11/17 5:31 PM, Stack wrote:
> > > >
> > > >> Don't want to make any assumptions, but I hope the lack of hard
> > > objection
> > > >>> can be interpreted as (begrudging, perhaps) acceptance of the plan.
> > Let
> > > >>> me/us know when possible, please!
> > > >>>
> > > >>>
> > > >>> Plan seems fine.
> > > >>
> > > >> Are you the owner of this feature now Josh or just shepherding it
> in?
> > > >>
> > > >
> > > > Thanks, Stack.
> > > >
> > > > Good question: should have included that out-right. Vlad, Ted, and
> > myself
> > > > had a chat on this last week.
> > > >
> > > > While Vlad is polishing HBASE-17852 and HBASE-17825, I told him I'll
> > help
> > > > out with the HBASE-18892 (testing) and the Book update. Was waiting
> for
> > > > some consensus on the testing gdoc before picking that up.
> > > >
> > > > I think Vlad is still the owner, but you could certainly call me a
> > > > shepherd. I also answer to "sherpa" ;)
> > > >
> > >
> >
>


Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0

2017-11-13 Thread Vladimir Rodionov
Yes, you are correct, Sean :)

On Mon, Nov 13, 2017 at 10:16 AM, Sean Busbey <bus...@apache.org> wrote:

> On Mon, Nov 13, 2017 at 11:46 AM, Vladimir Rodionov
> <vladrodio...@gmail.com> wrote:
> >>>Is there a high-level overview of what the feature should be able to do
> in
> >>>hbase-2? (The issue HBASE-14414  has a bunch of issues hanging off it.
> It
> >>>is hard to get an overview).
> >
> > Yes, it is in hbase book, Michael. HBASE-16754
> >
>
>
> HBASE-16754 Regions failing compaction due to referencing non-existent
> store file
>
>
> Probably HBASE-16574?
>


Re: [DISCUSS] Plan to avoid backup/restore removal from 2.0

2017-11-13 Thread Vladimir Rodionov
>>Is there a high-level overview of what the feature should be able to do in
>>hbase-2? (The issue HBASE-14414  has a bunch of issues hanging off it. It
>>is hard to get an overview).

Yes, it is in hbase book, Michael. HBASE-16754

>>Is there
>>anything on what user can expect in terms of size consumptions, resources
>>consumed effecting a backup, or how long a restore will take? I would think
>>it useful I'd imagine, particularly the latter bit of info as a rough
>>gauge.

Resource consumption for backup and restore is defined by the YARN resource
allocation of the queue we run both in: backup and restore. That should
probably be mentioned explicitly in the doc.
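
As a rough illustration (the queue name and capacity numbers below are made-up
assumptions, not recommendations), a dedicated capacity-scheduler queue caps what
the backup/restore M/R jobs can consume, and the standard MapReduce property
mapreduce.job.queuename routes the jobs to it:

  # capacity-scheduler.xml (illustrative values)
  yarn.scheduler.capacity.root.queues=default,backup
  yarn.scheduler.capacity.root.backup.capacity=10
  yarn.scheduler.capacity.root.backup.maximum-capacity=20

  # job-level MapReduce setting (in the job configuration) to use that queue
  mapreduce.job.queuename=backup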

Restore is entirely a sequence of M/R jobs; backup has some non-M/R stages:
the snapshot (full backup) and the distributed log roll stage.


>> Has anyone tried the example in the doc? (Backup to s3?).

Yes, as far as I remember, some time ago. We will include S3 testing in the
beta-2 testing cycle.

-Vlad

On Mon, Nov 13, 2017 at 9:02 AM, Mike Drob  wrote:

> I know I'm late to the party here, but I've got another potential blocker
> to add.
>
> We just ran an HP fortify scan internally and the results did not look
> good, specifically on IncrementalTableBackupClient and
> MapReduceBackupCopyJob. I'm still sorting through whether these are
> actually exploitable, or whether it's a symptom of MapReduce being an
> arbitrary code execution framework anyway but this does make me wonder
> about the overall security posture.
>
> I see  "HBase Backup/Restore Phase 3: Security"[1] resolved as "Later" and
> claims that it will be implemented in the client, both of which make me
> uncomfortable. Security Later is a general bad practice, and it is very
> rarely correct to rely on client-side security for anything.
>
> Is there another issue that covers security? Do we rely completely on HDFS
> security here for more than just the DistCP? What kind of testing has been
> done with security, do we have assurances that the backups aren't
> accidentally exposing tables to the world?
>
> Thanks,
> Mike
>
> [1]: https://issues.apache.org/jira/browse/HBASE-14138
>
> On Mon, Nov 13, 2017 at 10:38 AM, Josh Elser  wrote:
>
> > On 11/11/17 5:31 PM, Stack wrote:
> >
> >> Don't want to make any assumptions, but I hope the lack of hard
> objection
> >>> can be interpreted as (begrudging, perhaps) acceptance of the plan. Let
> >>> me/us know when possible, please!
> >>>
> >>>
> >>> Plan seems fine.
> >>
> >> Are you the owner of this feature now Josh or just shepherding it in?
> >>
> >
> > Thanks, Stack.
> >
> > Good question: should have included that out-right. Vlad, Ted, and myself
> > had a chat on this last week.
> >
> > While Vlad is polishing HBASE-17852 and HBASE-17825, I told him I'll help
> > out with the HBASE-18892 (testing) and the Book update. Was waiting for
> > some consensus on the testing gdoc before picking that up.
> >
> > I think Vlad is still the owner, but you could certainly call me a
> > shepherd. I also answer to "sherpa" ;)
> >
>


[jira] [Resolved] (HBASE-17133) Backup documentation update

2017-11-10 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-17133.
---
Resolution: Duplicate

Duplicate of HBASE-16574

> Backup documentation update
> ---
>
> Key: HBASE-17133
> URL: https://issues.apache.org/jira/browse/HBASE-17133
> Project: HBase
>  Issue Type: Bug
>    Reporter: Vladimir Rodionov
>Priority: Critical
>  Labels: backup
> Fix For: 2.0.0
>
>
> We need to update backup doc to sync it with the current implementation and 
> to add section for current limitations:
> {quote}
> - if you write to the table with Durability.SKIP_WAL your data will not
> be in the incremental backup
> - if you bulkload files, that data will not be in the incremental backup
> (HBASE-14417)
> - the incremental backup will not only contain the data of the table you
> specified but also the regions from other tables that are on the same set
> of RSs (HBASE-14141) ...maybe a note about security around this topic
> - the incremental backup will not contain just the "latest row" between
> backup A and B; it will also contain all the updates that occurred in
> between. But the restore does not allow you to restore up to a certain
> point in time; the restore will always be up to the "latest backup point".
> - you should limit the number of "incrementals" to N (or maybe SIZE), to
> avoid replay time becoming the bottleneck. (HBASE-14135)
> {quote} 
> Update command line tool section
> Clarify restore backup section
> Add section on backup delete algorithm
> Add section on how backup image dependency chain works.
> Add section for configuration
> hbase.backup.enable=true
> hbase.master.logcleaner.plugins=YOUR_PLUGINS,org.apache.hadoop.hbase.backup.master.BackupLogCleaner
> hbase.procedure.master.classes=YOUR_CLASSES,org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager
> hbase.procedure.regionserver.classes=YOUR_CLASSES,org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager
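
For reference, the same settings in hbase-site.xml form (YOUR_PLUGINS and
YOUR_CLASSES stand for whatever values the cluster already has configured; the
backup classes are appended, not substituted):

<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>YOUR_PLUGINS,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>YOUR_CLASSES,org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>YOUR_CLASSES,org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>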



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19211) B: update configuration string in BackupRestoreConstants

2017-11-07 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-19211:
-

 Summary: B: update configuration string in BackupRestoreConstants
 Key: HBASE-19211
 URL: https://issues.apache.org/jira/browse/HBASE-19211
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0-beta-1


To include a custom region observer implementation for tracking bulk-loading 
events.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-02 Thread Vladimir Rodionov
On doc,

We have a great doc attached to HBASE-7912 (unfortunately, it is a little bit
obsolete now).

On Thu, Nov 2, 2017 at 10:31 AM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> >>To be clear, I wasn't listing requirements. I was having trouble with
> the
> >>absolute "There is no way to validate correctness of backup in a general
> >>case."
>
> I am waiting for response from feature requester on what they expect from
> verification.
> Until then, I would rephrase my statement: "I do not see how we can
> perform correct verification ..."
>
> On Thu, Nov 2, 2017 at 9:20 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Nov 2, 2017 at 5:51 AM, Josh Elser <els...@apache.org> wrote:
>>
>> > On 11/1/17 11:33 PM, Stack wrote:
>> >
>> >> On Wed, Nov 1, 2017 at 5:08 PM, Vladimir Rodionov<
>> vladrodio...@gmail.com>
>> >> wrote:
>> >>
>> >> There is no way to validate correctness of backup in a general case.
>> >>>
>> >>> You can restore backup into temp table, but then what? Read rows
>> >>> one-by-one
>> >>> from temp table and look them up
>> >>>
>> >>
>> >>
>> >> in a primary table? Won't work, because rows can be deleted or modified
>> >>> since the last backup was done.
>> >>>
>> >>>
>> >>> Replication has a verity table tool.
>> >>
>> >> You can ask a cluster not delete rows.
>> >>
>> >> You can read at a specific timestamp.
>> >>
>> >> Or you could create backups during an extended ITBLL. When ITBLL
>> >> completes,
>> >> verify it on src cluster. Create a table from the increment backups.
>> >> Verify
>> >> in the restore.
>> >>
>> >> Etc.
>> >>
>> >> St.Ack
>> >>
>> >
>> > I can definitely see a benefit of a tool which verifies the data
>> collected
>> > for a backup which:
>> >
>> > 1. Is batch in nature
>> > 2. Is ad-hoc (not intrinsically run for every backup session)
>> > 3. Relies/is-built on existing tooling (snapshots or other
>> > verification-like code)
>> >
>> > Thanks Stack. I think this is some good teasing of requirements from an
>> > otherwise very broad and untenable problem statement that we started
>> with
>> > (which lead to the knee-jerk).
>> >
>>
>> To be clear, I wasn't listing requirements. I was having trouble with the
>> absolute "There is no way to validate correctness of backup in a general
>> case." which is then seemingly being used to beat down any request for
>> verification tooling/testing that shows backup/restore works properly.
>> Good on you Josh,
>> S
>>
>
>


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-02 Thread Vladimir Rodionov
>>To be clear, I wasn't listing requirements. I was having trouble with the
>>absolute "There is no way to validate correctness of backup in a general
>>case."

I am waiting for a response from the feature requester on what they expect from
verification.
Until then, I would rephrase my statement: "I do not see how we can perform
correct verification ..."

On Thu, Nov 2, 2017 at 9:20 AM, Stack <st...@duboce.net> wrote:

> On Thu, Nov 2, 2017 at 5:51 AM, Josh Elser <els...@apache.org> wrote:
>
> > On 11/1/17 11:33 PM, Stack wrote:
> >
> >> On Wed, Nov 1, 2017 at 5:08 PM, Vladimir Rodionov<vladrodionov@gmail.
> com>
> >> wrote:
> >>
> >> There is no way to validate correctness of backup in a general case.
> >>>
> >>> You can restore backup into temp table, but then what? Read rows
> >>> one-by-one
> >>> from temp table and look them up
> >>>
> >>
> >>
> >> in a primary table? Won't work, because rows can be deleted or modified
> >>> since the last backup was done.
> >>>
> >>>
> >>> Replication has a verity table tool.
> >>
> >> You can ask a cluster not delete rows.
> >>
> >> You can read at a specific timestamp.
> >>
> >> Or you could create backups during an extended ITBLL. When ITBLL
> >> completes,
> >> verify it on src cluster. Create a table from the increment backups.
> >> Verify
> >> in the restore.
> >>
> >> Etc.
> >>
> >> St.Ack
> >>
> >
> > I can definitely see a benefit of a tool which verifies the data
> collected
> > for a backup which:
> >
> > 1. Is batch in nature
> > 2. Is ad-hoc (not intrinsically run for every backup session)
> > 3. Relies/is-built on existing tooling (snapshots or other
> > verification-like code)
> >
> > Thanks Stack. I think this is some good teasing of requirements from an
> > otherwise very broad and untenable problem statement that we started with
> > (which lead to the knee-jerk).
> >
>
> To be clear, I wasn't listing requirements. I was having trouble with the
> absolute "There is no way to validate correctness of backup in a general
> case." which is then seemingly being used to beat down any request for
> verification tooling/testing that shows backup/restore works properly.
> Good on you Josh,
> S
>


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
Sean

No, this is not our backup description; it is a pre-backup-era guideline on how
to do backups in HBase using snapshots, CopyTable, etc.



On Wed, Nov 1, 2017 at 5:28 PM, Sean Busbey <bus...@apache.org> wrote:

> Vlad,
>
> As someone who hasn't spent much time with the backup/restore feature
> yet, could you help me out on getting a foothold?
>
> Which of these Backup/Restore options is it we're specifically talking
> about:
>
> http://hbase.apache.org/book.html#ops.backup
>
>
>
> On Wed, Nov 1, 2017 at 1:33 PM, Vladimir Rodionov
> <vladrodio...@gmail.com> wrote:
> >>> hbase-backup: Not done and it doesn't look like it will be done for
> > beta-1.
> >>>It can come in later in a 2.1 or 3.0 when it is finished.
> >
> > That is not correct. All blockers have been resolved; the last one has a
> > patch which is ready to be committed.
> >
> > The Salesforce team has conducted independent testing and found no issues
> > with the functionality, to the best of my knowledge.
> >
> > Explain please, Stack.
> >
> >
> >
> > On Wed, Nov 1, 2017 at 10:32 AM, Stack <st...@duboce.net> wrote:
> >
> >> I want to purge the below list of modules, features, and abandoned code
> >> from branch-2 before we make a beta-1 (4-5 weeks I'm thinking). Lets
> >> discuss. Some are already scheduled for removal but listing anyways for
> >> completeness sake. Pushback or other suggestions on what else we should
> >> remove are welcome.
> >>
> >> Distributed Log Replay: Just last week, I heard of someone scheduling
> >> testing of DLR. We need to better message that this never worked and
> was/is
> >> not supported. It's a good idea that we should implement but built on a
> >> different chasis (procedurev2?). Meantime, DLR is still scattered about
> the
> >> codebase as an optional code path. Lets remove it.
> >>
> >> hbase-native-client: It is not done and won't be for 2.0.0. It can come
> in
> >> later when it is done (2.1 or 3.0).
> >>
> >> hbase-prefix-tree: A visionary effort that unfortunately has had no
> uptake
> >> since its original wizard-author moved on. I don't believe it is used
> >> anywhere. It has become a drag as global changes need to be applied in
> here
> >> too by folks who are not up on how it works probably doing damage along
> the
> >> way. This is like DLR in it should be first class but we've not done the
> >> work to keep it up.
> >>
> >> hbase-backup: Not done and it doesn't look like it will be done for
> beta-1.
> >> It can come in later in a 2.1 or 3.0 when it is finished.
> >>
> >> hbase-spark: Purging this makes me tear-up.
> >>
> >> What else?
> >>
> >> Thanks,
> >> St.Ack
> >>
>


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
There is no way to validate the correctness of a backup in the general case.

You can restore a backup into a temp table, but then what? Read rows one by one
from the temp table and look them up
in the primary table? That won't work, because rows can be deleted or modified
after the last backup was taken.

Most of the time your results will be approximate: validation completed,
found 99.5% of rows. Will that satisfy the user?

Off topic here, but I hope the feature requester will explain in the
corresponding JIRA what type of *validation* they perform and expect.
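
That said, a narrow check is possible under strong assumptions: compare scans of
the source table and the restored table bounded by the backup end timestamp,
with deletes and major compactions held off for the duration. A minimal sketch
along those lines; the table names and the tool itself are hypothetical, not
part of B&R:

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

// Row-key-level comparison of a source table against a restored copy, bounded
// by the backup's end timestamp. A real check would also compare cells and
// handle deletes/compactions, which is exactly where the approach stops being
// general.
public class BackupScanCompare {
  public static void main(String[] args) throws IOException {
    long backupEndTs = Long.parseLong(args[0]);      // end time of the backup session
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table source = conn.getTable(TableName.valueOf("usertable"));
         Table restored = conn.getTable(TableName.valueOf("usertable_restored"));
         ResultScanner a = source.getScanner(new Scan().setTimeRange(0, backupEndTs));
         ResultScanner b = restored.getScanner(new Scan().setTimeRange(0, backupEndTs))) {
      long rows = 0;
      Result ra;
      while ((ra = a.next()) != null) {
        Result rb = b.next();
        if (rb == null || !Arrays.equals(ra.getRow(), rb.getRow())) {
          throw new IOException("Mismatch at row " + rows);
        }
        rows++;
      }
      if (b.next() != null) {
        throw new IOException("Restored table has rows beyond row " + rows);
      }
      System.out.println("Row keys match for " + rows + " rows up to ts=" + backupEndTs);
    }
  }
}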




On Wed, Nov 1, 2017 at 4:59 PM, Apekshit Sharma <a...@cloudera.com> wrote:

> As for HBASE-19106, when someone says that it's fundamental, i think they
> mean that some kind of validation that backup is correct is necessary, and
> i concur.
> Saying that something wasn't in initial feature list is hardly a
> justification! It's not like the idea was known when initial list was
> planned and was decided not to be done. It's new. And new things can be
> important!
>
>
>
>
> On Wed, Nov 1, 2017 at 4:34 PM, Apekshit Sharma <a...@cloudera.com> wrote:
>
> > Came here just to track anything related to Distributed Log Replay which
> I
> > am trying to purge. But looks like it's another discussion thread about
> > hbase-backup.
> > Am coming here with limited knowledge about the feature (did a review
> > initially once, lost track after). But then, looks like discussion is not
> > about technical aspects of feature, but trust in it.
> >
> > Something which can help get trust in B, or otherwise, is an accurate
> > summary of it as of now. Basically
> > 1) What features are there in 2.0
> > 2) What features are being targeted for 2.1 onwards
> > 3) What testing has been done so far. Not just names...details. For eg.
> > ITBLL w/ a 50-node cluster and x,y,z fault tolerances.
> > 4) What tests are planned before 2.0. I think a good basis to judge that
> > would be, will that testing convince Elliot/ Andrew to use that feature
> in
> > their internal clusters.
> > 5) List of existing bugs
> >
> > Once it's there, hopefully everyone agrees that list in (1) is enough and
> > items in (2) are non-critical for basic B
> > 3 and 4 are most important.
> > Missing anything in (5) will be counter-productive.
> > I'd appreciate if the summary is followed by opinions, and not mixed
> > together.
> >
> > Just a suggestion which can help you get right attention.
> > Thanks.
> >
> > -- Appy
> >
> >
> >
> > On Wed, Nov 1, 2017 at 3:33 PM, Vladimir Rodionov <
> vladrodio...@gmail.com>
> > wrote:
> >
> >> >> HBASE-19106 at least is a fundamental
> >>
> >> This new feature was requested 9 days ago (between alpha 3 and alpha 4
> >> releases) It has never been on a list of features we has agreed to
> >> implement for 2.0 release.
> >> When backup started almost 2 years ago, we described what features and
> >> capabilities will be implemented. We have had a discussions before and I
> >> do
> >> not remember any
> >> complaints from community that we lack important functionalities
> >>
> >> You can not point to it as a blocker for 2.0 release, Stack.
> >>
> >> Testing at scale (lack of) - the only real issue I see in B now. The
> >> question: can it justify your willingness to postpone feature till  next
> >> 2.x release, Stack?
> >>
> >> All blockers are resolved, including pending HBASE-17852 patch. All
> >> functionality for 2.0  has been implemented.   Scalability and
> performance
> >> improvements patch is in working
> >> and expected to be ready next week. In any case, this is improvement -
> not
> >> a new feature.
> >>
> >> We have been testing B in our internal QA clusters for months. Others
> >> (SF) have done testing as well. I am pretty confident in implementation.
> >>
> >>
> >>
> >> On Wed, Nov 1, 2017 at 3:15 PM, Josh Elser <els...@apache.org> wrote:
> >>
> >> > On 11/1/17 5:52 PM, Stack wrote:
> >> >
> >> >> On Wed, Nov 1, 2017 at 12:25 PM, Vladimir Rodionov<
> >> vladrodio...@gmail.com
> >> >> >
> >> >> wrote:
> >> >>
> >> >> 1. HBASE-19104 - 19109
> >> >>>
> >> >>> None of them are basic, Stack. These requests came from SF after
> >> >>> discussion
> >> >>> we had with them recently
> >> >>> No single comments is because I was out of 

Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
>> HBASE-19106 at least is a fundamental

This new feature was requested 9 days ago (between the alpha-3 and alpha-4
releases). It has never been on the list of features we had agreed to
implement for the 2.0 release.
When backup started almost 2 years ago, we described what features and
capabilities would be implemented. We have had discussions before and I do
not remember any
complaints from the community that we lack important functionality.

You cannot point to it as a blocker for the 2.0 release, Stack.

Testing at scale (the lack of it) is the only real issue I see in B&R now. The
question: can it justify your willingness to postpone the feature till the next
2.x release, Stack?

All blockers are resolved, including the pending HBASE-17852 patch. All
functionality for 2.0 has been implemented. The scalability and performance
improvements patch is in the works
and expected to be ready next week. In any case, this is an improvement, not
a new feature.

We have been testing B&R in our internal QA clusters for months. Others
(SF) have done testing as well. I am pretty confident in the implementation.



On Wed, Nov 1, 2017 at 3:15 PM, Josh Elser <els...@apache.org> wrote:

> On 11/1/17 5:52 PM, Stack wrote:
>
>> On Wed, Nov 1, 2017 at 12:25 PM, Vladimir Rodionov<vladrodio...@gmail.com
>> >
>> wrote:
>>
>> 1. HBASE-19104 - 19109
>>>
>>> None of them are basic, Stack. These requests came from SF after
>>> discussion
>>> we had with them recently
>>> No single comments is because I was out of country last week.
>>>
>>> 2. Backup tables are not system ones, they belong to a separate
>>> namespace -
>>> "backup"
>>>
>>> 3. We make no assumptions on assignment order of these tables.
>>>
>>> As for real scale testing and documentation , we still have time before
>>> 2.0GA.  Can't be blocker IMO
>>>
>>>
>>> First off, wrong response.
>>
>> Better would have been pointers to a description of the feature as it
>> stands in branch-2 (a list of JIRAs is insufficient), what is to be done
>> still, and evidence of heavy testing in particular at scale (as Josh
>> reminds us, we agreed to last time backup-in-hbase2 was broached) ending
>> with list of what will be done between here and beta-1 to assuage any
>> concerns that backup is incomplete. As to the issues filed, IMO,
>> HBASE-19106 at least is a fundamental. W/o it, how you even know backup
>> works at anything above toy scale.
>>
>> Pardon my mistake on 'system' tables. I'd made the statement 9 days ago up
>> in HBASE-17852 trying to figure what was going on in the issue and it
>> stood
>> unchallenged (Josh did let me know later that you were traveling).
>>
>> I'm not up for waiting till GA before we decide what is in the release.
>> This DISCUSSION is about deciding now, before beta-1, whats in and whats
>> out. Backup would be a great to have but it is currently on the chopping
>> block. I've tried to spend time figuring what is there and where it stands
>> but I always end up stymied (e.g. see HBASE-17852; see how it starts out;
>> see the patch attached w/ no description of what it comprises or the
>> approach decided upon; and so on). Maybe its me, but hey, unfortunately,
>> its me who is the RM.
>>
>
> As much as it pains me, I can't argue with the lack of confidence via
> testing. While it feels like an eternity ago since we posited on B's
> scale/correctness testing, it's only been 1.5 months. In reality, getting
> to this was delayed by some of the (really good!) FT fixes that Vlad has
> made.
>
> We set the bar for the feature and we missed it; there's not arguing that.
> Yes, it stinks. I see two paths forward: 1) come up with its own release to
> let those downstream use it now (risks withstanding) or 2) shoot for HBase
> 2.1.0. The latter is how we've approached this in the past. Building the
> test needs to happen regardless of the release vehicle.
>
> New issues/feature-requests are always going to come in as people
> experiment with it. I hope to avoid getting bogged down in this -- I
> sincerely doubt that there is any single answer to what is "required" for
> an initial backup and restore implementation. I feel like anything more
> will turn into a battle of opinions. When we bring up the feature again, we
> should make a concerted effort to say "this is the state of the feature,
> with the design choices made, and this the result of our testing for
> correctness." Hopefully much of this is already contained in documentation
> and just needs to be collected/curated.
>
> - Josh
>


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
>> That's not true, we found issues and filed JIRAs. As to how significant
>> they are or not, I defer to the JIRAs for discussion.

The majority of the JIRAs filed are trivial. If you do not agree, please post a
link to the bugs you consider serious.
The rest are requests for *new functionality* which SF needs.

And most of them (new features) can be implemented in a 1-2 month timeframe.
THESE are NEW FEATURE requests.
If you think that partial table backups and partial table restores are BASIC
functionality, I do not agree with you.

On Wed, Nov 1, 2017 at 12:20 PM, Andrew Purtell <apurt...@apache.org> wrote:

> > Can you explain please how did you guys manage to file multiple JIRAs
> (trivials mostly) without testing backup/restore?
>
> What the fuck, Vlad.
>
> Obviously we did some testing.
>
> You said "Salesforce team has conducted independent testing and found no
> issues with a functionality to my best knowledge."
>
> That's not true, we found issues and filed JIRAs. As to how significant
> they are or not, I defer to the JIRAs for discussion.
>
>
> On Wed, Nov 1, 2017 at 12:17 PM, Vladimir Rodionov <vladrodio...@gmail.com
> >
> wrote:
>
> > >>I dont' want to get drawn into another unfriendly argument, but this is
> > >>simply not true. We filed a bunch of JIRAs including one with serious
> > >>concerns about scalability.
> >
> > Can you explain please how did you guys manage to file multiple JIRAs
> > (trivials mostly)
> > without testing backup/restore?
> >
> > What you are referring to is not a scalability (its scalable), but
> > *possible* performance issue of incremental backup
> > We have JIRA and partial patch to address this issue HBASE-17825. This
> will
> > definitely make it into beta-1.
> >
> >
> > On Wed, Nov 1, 2017 at 12:00 PM, Stack <st...@duboce.net> wrote:
> >
> > > On Wed, Nov 1, 2017 at 11:53 AM, Josh Elser <els...@apache.org> wrote:
> > >
> > > >
> > > >
> > > > On 11/1/17 1:32 PM, Stack wrote:
> > > >
> > > >> I want to purge the below list of modules, features, and abandoned
> > code
> > > >> from branch-2 before we make a beta-1 (4-5 weeks I'm thinking). Lets
> > > >> discuss. Some are already scheduled for removal but listing anyways
> > for
> > > >> completeness sake. Pushback or other suggestions on what else we
> > should
> > > >> remove are welcome.
> > > >>
> > > >> Distributed Log Replay: Just last week, I heard of someone
> scheduling
> > > >> testing of DLR. We need to better message that this never worked and
> > > >> was/is
> > > >> not supported. It's a good idea that we should implement but built
> on
> > a
> > > >> different chasis (procedurev2?). Meantime, DLR is still scattered
> > about
> > > >> the
> > > >> codebase as an optional code path. Lets remove it.
> > > >>
> > > >> hbase-native-client: It is not done and won't be for 2.0.0. It can
> > come
> > > in
> > > >> later when it is done (2.1 or 3.0).
> > > >>
> > > >
> > > > I think that's fine. It's in a state where people can use it to do
> > basic
> > > > read-write operations. While it would be nice to have this go out to
> > > folks
> > > > who would test it, forcing that to happen via inclusion in a release
> > > isn't
> > > > necessary.
> > > >
> > > >
> > >
> > > Grand.
> > >
> > >
> > >
> > > > hbase-prefix-tree: A visionary effort that unfortunately has had no
> > > uptake
> > > >> since its original wizard-author moved on. I don't believe it is
> used
> > > >> anywhere. It has become a drag as global changes need to be applied
> in
> > > >> here
> > > >> too by folks who are not up on how it works probably doing damage
> > along
> > > >> the
> > > >> way. This is like DLR in it should be first class but we've not done
> > the
> > > >> work to keep it up.
> > > >>
> > > >> hbase-backup: Not done and it doesn't look like it will be done for
> > > >> beta-1.
> > > >> It can come in later in a 2.1 or 3.0 when it is finished.
> > > >>
> > > >
> > > > Ditto to what Vlad said. AFAIK, just the one issue remains:
> > HBASE-178

Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
>>I've not done in-depth research so could be wrong about backup, but from my
>>perch, I've seen the recent filings against backup, HBASE-19104-19109,
>>which strike me as pretty basic missing facility. The new issues go without
>>comment (caveat a single question). I've seen no evidence of extensive test
>>(scale?). The last issue I looked at has backup putting up two system
>>tables with presumptions about assignment order we do not (as yet) support.
>>I've always had trouble eliciting state of the feature; summary of
>>capability and what is to do are hard to come by.

1. HBASE-19104 - 19109

None of them are basic, Stack. These requests came from SF after a discussion
we had with them recently.
The lack of comments is because I was out of the country last week.

2. Backup tables are not system ones; they belong to a separate namespace,
"backup".

3. We make no assumptions on the assignment order of these tables.

As for real scale testing and documentation, we still have time before
2.0 GA. Can't be a blocker IMO.

On Wed, Nov 1, 2017 at 12:17 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> >>I dont' want to get drawn into another unfriendly argument, but this is
> >>simply not true. We filed a bunch of JIRAs including one with serious
> >>concerns about scalability.
>
> Can you explain please how did you guys manage to file multiple JIRAs
> (trivials mostly)
> without testing backup/restore?
>
> What you are referring to is not a scalability (its scalable), but
> *possible* performance issue of incremental backup
> We have JIRA and partial patch to address this issue HBASE-17825. This
> will definitely make it into beta-1.
>
>
> On Wed, Nov 1, 2017 at 12:00 PM, Stack <st...@duboce.net> wrote:
>
>> On Wed, Nov 1, 2017 at 11:53 AM, Josh Elser <els...@apache.org> wrote:
>>
>> >
>> >
>> > On 11/1/17 1:32 PM, Stack wrote:
>> >
>> >> I want to purge the below list of modules, features, and abandoned code
>> >> from branch-2 before we make a beta-1 (4-5 weeks I'm thinking). Lets
>> >> discuss. Some are already scheduled for removal but listing anyways for
>> >> completeness sake. Pushback or other suggestions on what else we should
>> >> remove are welcome.
>> >>
>> >> Distributed Log Replay: Just last week, I heard of someone scheduling
>> >> testing of DLR. We need to better message that this never worked and
>> >> was/is
>> >> not supported. It's a good idea that we should implement but built on a
>> >> different chasis (procedurev2?). Meantime, DLR is still scattered about
>> >> the
>> >> codebase as an optional code path. Lets remove it.
>> >>
>> >> hbase-native-client: It is not done and won't be for 2.0.0. It can
>> come in
>> >> later when it is done (2.1 or 3.0).
>> >>
>> >
>> > I think that's fine. It's in a state where people can use it to do basic
>> > read-write operations. While it would be nice to have this go out to
>> folks
>> > who would test it, forcing that to happen via inclusion in a release
>> isn't
>> > necessary.
>> >
>> >
>>
>> Grand.
>>
>>
>>
>> > hbase-prefix-tree: A visionary effort that unfortunately has had no
>> uptake
>> >> since its original wizard-author moved on. I don't believe it is used
>> >> anywhere. It has become a drag as global changes need to be applied in
>> >> here
>> >> too by folks who are not up on how it works probably doing damage along
>> >> the
>> >> way. This is like DLR in it should be first class but we've not done
>> the
>> >> work to keep it up.
>> >>
>> >> hbase-backup: Not done and it doesn't look like it will be done for
>> >> beta-1.
>> >> It can come in later in a 2.1 or 3.0 when it is finished.
>> >>
>> >
>> > Ditto to what Vlad said. AFAIK, just the one issue remains: HBASE-17852.
>> > Didn't want to bother you with it while you were head-down on alpha4,
>> > Stack; can you take a look at the explanation Vlad has put up there so
>> we
>> > can try to move it forward? I don't think this needs to be punted out.
>> >
>> >
>> I'll take a look sir.
>>
>>
>>
>> > hbase-spark: Purging this makes me tear-up.
>> >>
>> >
>> > We had a talk about this a while back, didn't we? I forget if we had
>> > consensus about having it follow its own release schedule (rather than
>> > tying it to HBase "internals") -- I think I suggested that anyways :P.
>> >
>> >
>> Sean filed HBASE-18817 <https://issues.apache.org/jira/browse/HBASE-18817>
>> with
>> the result of that discussion.
>>
>>
>>
>> > Now that I'm thinking about it, I wonder if that's actually the proper
>> > route forward for hbase-native-client too...
>> >
>> >
>> And backup?
>>
>> Thanks Josh.
>> St.Ack
>>
>>
>>
>>
>> > What else?
>> >>
>> >> Thanks,
>> >> St.Ack
>> >>
>> >>
>>
>
>


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
>>I dont' want to get drawn into another unfriendly argument, but this is
>>simply not true. We filed a bunch of JIRAs including one with serious
>>concerns about scalability.

Can you please explain how you managed to file multiple JIRAs
(mostly trivial ones)
without testing backup/restore?

What you are referring to is not a scalability problem (it is scalable), but a
*possible* performance issue with incremental backup.
We have a JIRA and a partial patch to address this issue, HBASE-17825. It will
definitely make it into beta-1.


On Wed, Nov 1, 2017 at 12:00 PM, Stack  wrote:

> On Wed, Nov 1, 2017 at 11:53 AM, Josh Elser  wrote:
>
> >
> >
> > On 11/1/17 1:32 PM, Stack wrote:
> >
> >> I want to purge the below list of modules, features, and abandoned code
> >> from branch-2 before we make a beta-1 (4-5 weeks I'm thinking). Lets
> >> discuss. Some are already scheduled for removal but listing anyways for
> >> completeness sake. Pushback or other suggestions on what else we should
> >> remove are welcome.
> >>
> >> Distributed Log Replay: Just last week, I heard of someone scheduling
> >> testing of DLR. We need to better message that this never worked and
> >> was/is
> >> not supported. It's a good idea that we should implement but built on a
> >> different chasis (procedurev2?). Meantime, DLR is still scattered about
> >> the
> >> codebase as an optional code path. Lets remove it.
> >>
> >> hbase-native-client: It is not done and won't be for 2.0.0. It can come
> in
> >> later when it is done (2.1 or 3.0).
> >>
> >
> > I think that's fine. It's in a state where people can use it to do basic
> > read-write operations. While it would be nice to have this go out to
> folks
> > who would test it, forcing that to happen via inclusion in a release
> isn't
> > necessary.
> >
> >
>
> Grand.
>
>
>
> > hbase-prefix-tree: A visionary effort that unfortunately has had no
> uptake
> >> since its original wizard-author moved on. I don't believe it is used
> >> anywhere. It has become a drag as global changes need to be applied in
> >> here
> >> too by folks who are not up on how it works probably doing damage along
> >> the
> >> way. This is like DLR in it should be first class but we've not done the
> >> work to keep it up.
> >>
> >> hbase-backup: Not done and it doesn't look like it will be done for
> >> beta-1.
> >> It can come in later in a 2.1 or 3.0 when it is finished.
> >>
> >
> > Ditto to what Vlad said. AFAIK, just the one issue remains: HBASE-17852.
> > Didn't want to bother you with it while you were head-down on alpha4,
> > Stack; can you take a look at the explanation Vlad has put up there so we
> > can try to move it forward? I don't think this needs to be punted out.
> >
> >
> I'll take a look sir.
>
>
>
> > hbase-spark: Purging this makes me tear-up.
> >>
> >
> > We had a talk about this a while back, didn't we? I forget if we had
> > consensus about having it follow its own release schedule (rather than
> > tying it to HBase "internals") -- I think I suggested that anyways :P.
> >
> >
> Sean filed HBASE-18817 
> with
> the result of that discussion.
>
>
>
> > Now that I'm thinking about it, I wonder if that's actually the proper
> > route forward for hbase-native-client too...
> >
> >
> And backup?
>
> Thanks Josh.
> St.Ack
>
>
>
>
> > What else?
> >>
> >> Thanks,
> >> St.Ack
> >>
> >>
>


[jira] [Created] (HBASE-19149) Improve backup/restore progress indicator

2017-11-01 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-19149:
-

 Summary: Improve backup/restore progress indicator
 Key: HBASE-19149
 URL: https://issues.apache.org/jira/browse/HBASE-19149
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Normal


The operation progress calculation should be more precise. Currently, we can 
only report progress per M/R job, which may confuse the user.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSSION] Items to purge from branch-2 before we cut hbase-2.0.0-beta1.

2017-11-01 Thread Vladimir Rodionov
>> hbase-backup: Not done and it doesn't look like it will be done for
beta-1.
>>It can come in later in a 2.1 or 3.0 when it is finished.

That is not correct. All blockers have been resolved; the last one has a
patch which is ready to be committed.

The Salesforce team has conducted independent testing and, to the best of my
knowledge, found no issues with the functionality.

Please explain, Stack.



On Wed, Nov 1, 2017 at 10:32 AM, Stack  wrote:

> I want to purge the below list of modules, features, and abandoned code
> from branch-2 before we make a beta-1 (4-5 weeks I'm thinking). Lets
> discuss. Some are already scheduled for removal but listing anyways for
> completeness sake. Pushback or other suggestions on what else we should
> remove are welcome.
>
> Distributed Log Replay: Just last week, I heard of someone scheduling
> testing of DLR. We need to better message that this never worked and was/is
> not supported. It's a good idea that we should implement but built on a
> different chasis (procedurev2?). Meantime, DLR is still scattered about the
> codebase as an optional code path. Lets remove it.
>
> hbase-native-client: It is not done and won't be for 2.0.0. It can come in
> later when it is done (2.1 or 3.0).
>
> hbase-prefix-tree: A visionary effort that unfortunately has had no uptake
> since its original wizard-author moved on. I don't believe it is used
> anywhere. It has become a drag as global changes need to be applied in here
> too by folks who are not up on how it works probably doing damage along the
> way. This is like DLR in it should be first class but we've not done the
> work to keep it up.
>
> hbase-backup: Not done and it doesn't look like it will be done for beta-1.
> It can come in later in a 2.1 or 3.0 when it is finished.
>
> hbase-spark: Purging this makes me tear-up.
>
> What else?
>
> Thanks,
> St.Ack
>


[jira] [Created] (HBASE-19006) Fix TestIncrementalBackupWithBulkLoad under hadoop3

2017-10-13 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-19006:
-

 Summary: Fix TestIncrementalBackupWithBulkLoad under hadoop3
 Key: HBASE-19006
 URL: https://issues.apache.org/jira/browse/HBASE-19006
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Critical
 Fix For: 2.0.0-beta-2






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18975) B hadoop3 incompatibility

2017-10-09 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18975:
-

 Summary: B hadoop3 incompatibility
 Key: HBASE-18975
 URL: https://issues.apache.org/jira/browse/HBASE-18975
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Due to changes in hadoop 3, reflection in BackupDistCp is broken
{code}
java.lang.NoSuchFieldException: inputOptions
  at java.lang.Class.getDeclaredField(Class.java:2070)
  at 
org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob$BackupDistCp.execute(MapReduceBackupCopyJob.java:168)
{code}  
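
For context, a minimal, hypothetical sketch of a more version-tolerant field
lookup (this is not the actual MapReduceBackupCopyJob fix, and the candidate
field names a caller would pass are assumptions):
{code}
import java.lang.reflect.Field;

public final class FieldLookup {
  private FieldLookup() {}

  // Class.getDeclaredField() does not search superclasses and throws
  // NoSuchFieldException when a private field is renamed between Hadoop
  // versions, so try a list of candidate names instead of a single one.
  public static Field findDeclaredField(Class<?> cls, String... candidates)
      throws NoSuchFieldException {
    for (String name : candidates) {
      try {
        Field f = cls.getDeclaredField(name);
        f.setAccessible(true);
        return f;
      } catch (NoSuchFieldException e) {
        // fall through and try the next candidate name
      }
    }
    throw new NoSuchFieldException(String.join(", ", candidates));
  }
}
{code}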



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18892) B testing

2017-09-27 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18892:
-

 Summary: B testing
 Key: HBASE-18892
 URL: https://issues.apache.org/jira/browse/HBASE-18892
 Project: HBase
  Issue Type: Umbrella
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Backup & Restore testing umbrella, for all bugs discovered



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Becoming a Committer

2017-09-20 Thread Vladimir Rodionov
>> If you are such a paid professional, sure, it's no problem
>> for you, but you're already getting paid to be here.

You can specify separate levels for paid and non-paid professionals.
For example, for a paid professional, at least one meetup/conference
presentation and at least one major feature would be a must.
Otherwise, you discriminate against those who work on HBase (or other OSS
projects) professionally.

As I have already mentioned, vaguely specified subjective criteria leave
too much room for speculation and questions on one side and abuse
on the other.

-Vlad



On Wed, Sep 20, 2017 at 4:30 PM, Andrew Purtell <apurt...@apache.org> wrote:

> Again, the presumption of bad intent. It's poisonous, IMHO. I think this
> viewpoint needs to be justified. What actions have the HBase PMC taken, or
> not taken, that cause you to suspect this? Experiences in other communities
> where there has been bad faith are regrettable but not germane, unless the
> same actors are here in this PMC, in which case I think the PMC would
> welcome your concerns, on private@ if need be. Likewise, if the HBase PMC
> has taken suspect actions I think discussion would be welcome either here
> or on private@, to address specific concerns.
>
> If the community would like to press ahead and address a perceived problem
> of lack of objective criteria, fine, but then I'd like to see that criteria
> well specified, and every candidate would need to meet it without
> exception. I don't think that is particularly healthy for the project. The
> criteria we will come up with will strongly favor paid professionals
> because they are the ones who will have the (paid) time to post numbers to
> meet objective criteria such as number of commits, LOC changed, number of
> JIRAs, and such. If you are such a paid professional, sure, it's no problem
> for you, but you're already getting paid to be here.
>
>
> On Wed, Sep 20, 2017 at 3:57 PM, Vladimir Rodionov <vladrodio...@gmail.com
> >
> ​ ​
> wrote:
>
> > Any subjective criteria, such as "acting like a committer" open wide room
> > for а power abuse of PMC members.
> >
> > My 2c
> >
> > -Vlad
> >
> > On Wed, Sep 20, 2017 at 3:05 PM, Andrew Purtell <apurt...@apache.org>
> > wrote:
> >
> > > By the way I think "act like a committer and you'll become a committer"
> > is
> > > pretty good advice for anyone looking to enter into participation in an
> > > open source community, and a reasonable yardstick to judge candidates
> who
> > > have been nominated. I also have no objection to documenting a list of
> > > favorable attributes. I would hope every PMCer voting on candidates
> will
> > be
> > > fair and remember how they judged previous candidates, and be
> objective.
> > I
> > > give everyone the presumption of acting in good faith and that's enough
> > > (for me). What makes me allergic to this discussion is words like
> > > "prerequisite" and the implication that our current process has been
> > unfair
> > > or is not aligned with the Apache Way. I think that case should be made
> > if
> > > we need to make it.
> > >
> > >
> > > On Wed, Sep 20, 2017 at 2:54 PM, Andrew Purtell <apurt...@apache.org>
> > > wrote:
> > >
> > > > > will lead to folks motivated wrongly, similar to oft maligned
> "resume
> > > > driven development?"
> > > >
> > > > I find the need to have this discussion mildly offensive. Have we
> been
> > > > unfair in offering committership? Do you have a specific example of
> > > > something that looked improper? Can you name a committer whom you
> think
> > > was
> > > > offered committership without sufficient merit? Can you name any
> action
> > > we
> > > > have taken that smacks of "resume driven development"?
> > > >
> > > > I take the opposite view. I think the presumption of good faith in
> some
> > > > communities has been ground down by inter-vendor conflicts and as a
> > > result
> > > > they are very litigious and everything must be super specified and
> "by
> > > the
> > > > book" according to some formal process that drains the spirit of the
> > > Apache
> > > > Way and is corrosive to everything that holds open source communities
> > > > together. I don't think importing these ways to the HBase community
> is
> > > > either necessary or wise at this time.
> > > >
> > > > I'd like nom

Re: [DISCUSS] Becoming a Committer

2017-09-20 Thread Vladimir Rodionov
The most important criterion that makes sense is the level of involvement in
HBase development:

1. Number of JIRAs resolved
2. Number of major features implemented
3. Number of new LOC added to code base

This translates directly into the number of hours (days, weeks, months, and
even years) spent on HBase development.

Another criterion that makes sense is community involvement:

1. Presentations, tech talks at meetups and conferences
2. Activity in user and dev mailing lists

PMC members can score each category and award committer status to the person
with the maximum score.

That is a truly quantifiable approach that leaves almost no room for abuse of
power.

-Vlad









On Wed, Sep 20, 2017 at 3:57 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> Any subjective criteria, such as "acting like a committer" open wide room
> for а power abuse of PMC members.
>
> My 2c
>
> -Vlad
>
> On Wed, Sep 20, 2017 at 3:05 PM, Andrew Purtell <apurt...@apache.org>
> wrote:
>
>> By the way I think "act like a committer and you'll become a committer" is
>> pretty good advice for anyone looking to enter into participation in an
>> open source community, and a reasonable yardstick to judge candidates who
>> have been nominated. I also have no objection to documenting a list of
>> favorable attributes. I would hope every PMCer voting on candidates will
>> be
>> fair and remember how they judged previous candidates, and be objective. I
>> give everyone the presumption of acting in good faith and that's enough
>> (for me). What makes me allergic to this discussion is words like
>> "prerequisite" and the implication that our current process has been
>> unfair
>> or is not aligned with the Apache Way. I think that case should be made if
>> we need to make it.
>>
>>
>> On Wed, Sep 20, 2017 at 2:54 PM, Andrew Purtell <apurt...@apache.org>
>> wrote:
>>
>> > > will lead to folks motivated wrongly, similar to oft maligned "resume
>> > driven development?"
>> >
>> > I find the need to have this discussion mildly offensive. Have we been
>> > unfair in offering committership? Do you have a specific example of
>> > something that looked improper? Can you name a committer whom you think
>> was
>> > offered committership without sufficient merit? Can you name any action
>> we
>> > have taken that smacks of "resume driven development"?
>> >
>> > I take the opposite view. I think the presumption of good faith in some
>> > communities has been ground down by inter-vendor conflicts and as a
>> result
>> > they are very litigious and everything must be super specified and "by
>> the
>> > book" according to some formal process that drains the spirit of the
>> Apache
>> > Way and is corrosive to everything that holds open source communities
>> > together. I don't think importing these ways to the HBase community is
>> > either necessary or wise at this time.
>> >
>> > I'd like nominations for committership and PMC to be addressed on a case
>> > by case basis. Perhaps we should have greater transparency in the
>> welcome
>> > announcement.
>> >
>> >
>> > On Wed, Sep 20, 2017 at 11:48 AM, Mike Drob <md...@apache.org> wrote:
>> >
>> >>  Hi folks,
>> >>
>> >> I've been chatting with folks off and on about this for a while, and
>> was
>> >> told that this made sense as a discussion on the dev@ list.
>> >>
>> >> How does the PMC select folks for committership? The most common
>> answer is
>> >> that folks should 'act like a committer' but that's painfully nebulous
>> and
>> >> easy to get sidetracked onto other topics. The problem is compounded
>> >> because what may be great on one project is inconsistently applied on
>> >> other
>> >> projects in the ASF, and yet we are all very tightly coupled as
>> >> communities
>> >> and as project dependencies.
>> >>
>> >> Ideally, this is something that we can document in the book. Misty
>> gently
>> >> pointed out http://hbase.apache.org/book.h
>> tml#_guide_for_hbase_committers
>> >> but
>> >> also noted that it's for what happens after somebody becomes a
>> committer.
>> >> Still, if the standard is "act like one until you become one" then it's
>> >> useful reading for people. Also, there doesn't seem to be any
>> guidelines
>> >> like this for PMC.

Re: [DISCUSS] Becoming a Committer

2017-09-20 Thread Vladimir Rodionov
Any subjective criteria, such as "acting like a committer," leave wide room
for abuse of power by PMC members.

My 2c

-Vlad

On Wed, Sep 20, 2017 at 3:05 PM, Andrew Purtell  wrote:

> By the way I think "act like a committer and you'll become a committer" is
> pretty good advice for anyone looking to enter into participation in an
> open source community, and a reasonable yardstick to judge candidates who
> have been nominated. I also have no objection to documenting a list of
> favorable attributes. I would hope every PMCer voting on candidates will be
> fair and remember how they judged previous candidates, and be objective. I
> give everyone the presumption of acting in good faith and that's enough
> (for me). What makes me allergic to this discussion is words like
> "prerequisite" and the implication that our current process has been unfair
> or is not aligned with the Apache Way. I think that case should be made if
> we need to make it.
>
>
> On Wed, Sep 20, 2017 at 2:54 PM, Andrew Purtell 
> wrote:
>
> > > will lead to folks motivated wrongly, similar to oft maligned "resume
> > driven development?"
> >
> > I find the need to have this discussion mildly offensive. Have we been
> > unfair in offering committership? Do you have a specific example of
> > something that looked improper? Can you name a committer whom you think
> was
> > offered committership without sufficient merit? Can you name any action
> we
> > have taken that smacks of "resume driven development"?
> >
> > I take the opposite view. I think the presumption of good faith in some
> > communities has been ground down by inter-vendor conflicts and as a
> result
> > they are very litigious and everything must be super specified and "by
> the
> > book" according to some formal process that drains the spirit of the
> Apache
> > Way and is corrosive to everything that holds open source communities
> > together. I don't think importing these ways to the HBase community is
> > either necessary or wise at this time.
> >
> > I'd like nominations for committership and PMC to be addressed on a case
> > by case basis. Perhaps we should have greater transparency in the welcome
> > announcement.
> >
> >
> > On Wed, Sep 20, 2017 at 11:48 AM, Mike Drob  wrote:
> >
> >>  Hi folks,
> >>
> >> I've been chatting with folks off and on about this for a while, and was
> >> told that this made sense as a discussion on the dev@ list.
> >>
> >> How does the PMC select folks for committership? The most common answer
> is
> >> that folks should 'act like a committer' but that's painfully nebulous
> and
> >> easy to get sidetracked onto other topics. The problem is compounded
> >> because what may be great on one project is inconsistently applied on
> >> other
> >> projects in the ASF, and yet we are all very tightly coupled as
> >> communities
> >> and as project dependencies.
> >>
> >> Ideally, this is something that we can document in the book. Misty
> gently
> >> pointed out http://hbase.apache.org/book.html#_guide_for_hbase_
> committers
> >> but
> >> also noted that it's for what happens after somebody becomes a
> committer.
> >> Still, if the standard is "act like one until you become one" then it's
> >> useful reading for people. Also, there doesn't seem to be any guidelines
> >> like this for PMC.
> >>
> >> Is the list of prerequisites possible to articulate, or will it always
> >> boil
> >> down to "intangibles?" Is there a concern that providing a checklist
> >> (perhaps a list of items necessary, but not sufficient) will lead to
> folks
> >> motivated wrongly, similar to oft maligned "resume driven development?"
> >>
> >> I'll kick off the discussion by saying that my personal yardstick of
> "Can
> >> I
> >> trust this person's judgement regarding code/reviews" is probably too
> >> vague
> >> to be useful, and even worse is impossible for others to apply.
> >>
> >> Curiously,
> >> Mike
> >>
> >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >- A23, Crosstalk
> >
>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


[jira] [Created] (HBASE-18843) Add DistCp support to incremental backup with bulk loading

2017-09-18 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18843:
-

 Summary: Add DistCp support to incremental backup with bulk loading
 Key: HBASE-18843
 URL: https://issues.apache.org/jira/browse/HBASE-18843
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (HBASE-14417) Incremental backup and bulk loading

2017-09-18 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-14417:
---
  Assignee: Vladimir Rodionov  (was: Ted Yu)

> Incremental backup and bulk loading
> ---
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
>  Issue Type: New Feature
>    Reporter: Vladimir Rodionov
>    Assignee: Vladimir Rodionov
>Priority: Blocker
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v11.txt, 
> 14417-tbl-ext.v14.txt, 14417-tbl-ext.v18.txt, 14417-tbl-ext.v19.txt, 
> 14417-tbl-ext.v20.txt, 14417-tbl-ext.v21.txt, 14417-tbl-ext.v22.txt, 
> 14417-tbl-ext.v23.txt, 14417-tbl-ext.v24.txt, 14417-tbl-ext.v9.txt, 
> 14417.v11.txt, 14417.v13.txt, 14417.v1.txt, 14417.v21.txt, 14417.v23.txt, 
> 14417.v24.txt, 14417.v25.txt, 14417.v2.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create new full backup of a 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Here is the review board (out of date):
> https://reviews.apache.org/r/54258/
> In order not to miss the hfiles which are loaded into region directories in a 
> situation where postBulkLoadHFile() hook is not called (bulk load being 
> interrupted), we record hfile names thru preCommitStoreFile() hook.
> At time of incremental backup, we check the presence of such hfiles. If they 
> are present, they become part of the incremental backup image.
> Here is review board:
> https://reviews.apache.org/r/57790/
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Plan for Distributed testing of Backup and Restore

2017-09-12 Thread Vladimir Rodionov
Yes, we already have some integration tests (IT), so we will need to upgrade
them for scale testing.

On Tue, Sep 12, 2017 at 11:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. we need a test tool similar to ITBLL
>
> How about making the following such a tool ?
>
> hbase-it/src/test/java/org/apache/hadoop/hbase/
> IntegrationTestBackupRestore.java
>
> On Tue, Sep 12, 2017 at 11:25 AM, Vladimir Rodionov <
> vladrodio...@gmail.com>
> wrote:
>
> > >> Vlad: I'm obviously curious to see what you think about this stuff, in
> > addition to what you already had in mind :)
> >
> > Yes, I think that we need a test tool similar to ITBLL. Btw, making
> backup
> > working in challenging conditions was not a goal of FT design, correct
> > failure handling was a goal.
> >
> > On Tue, Sep 12, 2017 at 9:53 AM, Josh Elser <els...@apache.org> wrote:
> >
> > > Thanks for the quick feedback!
> > >
> > > On 9/12/17 12:36 PM, Stack wrote:
> > >
> > >> On Tue, Sep 12, 2017 at 9:33 AM, Andrew Purtell <
> > andrew.purt...@gmail.com
> > >> >
> > >> wrote:
> > >>
> > >> I think those are reasonable criteria Josh.
> > >>>
> > >>> What I would like to see is something like "we ran ITBLL (or custom
> > >>> generator with similar correctness validation if you prefer) on a dev
> > >>> cluster (5-10 nodes) for 24 hours with server killing chaos agents
> > >>> active,
> > >>> attempted 1,440 backups (one per minute), of which 1,000 succeeded
> and
> > >>> 100%
> > >>> if these were successfully restored and validated." This implies your
> > >>> points on automation and no manual intervention. Maybe the number of
> > >>> successful backups under challenging conditions will be lower. Point
> is
> > >>> they demonstrate we can rely on it even when a cluster is partially
> > >>> unhealthy, which in production is often the normal order of affairs.
> > >>>
> > >>>
> > >>>
> > > I like it. I hadn't thought about stressing quite this aggressively,
> but
> > > now that I think about it, sounds like a great plan. Having some
> ballpark
> > > measure to quantify the cost of a "backup-heavy" workload would be cool
> > in
> > > addition to seeing how the system reacts in unexpected manners.
> > >
> > > Sounds good to me.
> > >>
> > >> How will you test the restore aspect? After 1k (or whatever makes
> sense)
> > >> incremental backups over the life of the chaos, could you restore and
> > >> validate that the table had all expected data in place.
> > >>
> > >
> > > Exactly. My thinking was that, at any point, we should be able to do a
> > > restore and validate. Maybe something like: every Nth ITBLL iteration,
> > make
> > > a new backup point, restore a previous backup point, verify, restore to
> > > newest backup point. The previous backup point should be a full or
> > > incremental point.
> > >
> > > Vlad: I'm obviously curious to see what you think about this stuff, in
> > > addition to what you already had in mind :)
> > >
> >
>


Re: [DISCUSS] Plan for Distributed testing of Backup and Restore

2017-09-12 Thread Vladimir Rodionov
>> Vlad: I'm obviously curious to see what you think about this stuff, in
addition to what you already had in mind :)

Yes, I think we need a test tool similar to ITBLL. By the way, making backup
work under challenging conditions was not a goal of the FT design; correct
failure handling was the goal.
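
For illustration, a rough, hypothetical sketch of what such a tool's main
loop could look like (the Ops interface and its method names are placeholders,
not real HBase APIs), following the back-up-every-iteration and periodically
restore-and-verify idea discussed below:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch only: Ops stands in for cluster actions (ITBLL data
 *  generation, backup, restore, verification) and is not a real HBase API. */
public class BackupRestoreChaosSketch {
  interface Ops {
    void runItbllIteration();                  // load and link-verify more data
    String createBackup(boolean full);         // returns a backup id
    void restoreIntoScratchTable(String backupId);
    void verifyLinkedList(String backupId);    // ITBLL-style verification
  }

  static void run(Ops ops, int iterations, int restoreEvery) {
    List<String> backupIds = new ArrayList<>();
    Random rnd = new Random();
    for (int i = 1; i <= iterations; i++) {
      ops.runItbllIteration();
      backupIds.add(ops.createBackup(i == 1)); // first backup full, rest incremental
      if (i % restoreEvery == 0) {
        String older = backupIds.get(rnd.nextInt(backupIds.size()));
        ops.restoreIntoScratchTable(older);    // restore an older point and verify it
        ops.verifyLinkedList(older);
        String newest = backupIds.get(backupIds.size() - 1);
        ops.restoreIntoScratchTable(newest);   // then the newest point
        ops.verifyLinkedList(newest);
      }
    }
  }
}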

On Tue, Sep 12, 2017 at 9:53 AM, Josh Elser  wrote:

> Thanks for the quick feedback!
>
> On 9/12/17 12:36 PM, Stack wrote:
>
>> On Tue, Sep 12, 2017 at 9:33 AM, Andrew Purtell > >
>> wrote:
>>
>> I think those are reasonable criteria Josh.
>>>
>>> What I would like to see is something like "we ran ITBLL (or custom
>>> generator with similar correctness validation if you prefer) on a dev
>>> cluster (5-10 nodes) for 24 hours with server killing chaos agents
>>> active,
>>> attempted 1,440 backups (one per minute), of which 1,000 succeeded and
>>> 100%
>>> if these were successfully restored and validated." This implies your
>>> points on automation and no manual intervention. Maybe the number of
>>> successful backups under challenging conditions will be lower. Point is
>>> they demonstrate we can rely on it even when a cluster is partially
>>> unhealthy, which in production is often the normal order of affairs.
>>>
>>>
>>>
> I like it. I hadn't thought about stressing quite this aggressively, but
> now that I think about it, sounds like a great plan. Having some ballpark
> measure to quantify the cost of a "backup-heavy" workload would be cool in
> addition to seeing how the system reacts in unexpected manners.
>
> Sounds good to me.
>>
>> How will you test the restore aspect? After 1k (or whatever makes sense)
>> incremental backups over the life of the chaos, could you restore and
>> validate that the table had all expected data in place.
>>
>
> Exactly. My thinking was that, at any point, we should be able to do a
> restore and validate. Maybe something like: every Nth ITBLL iteration, make
> a new backup point, restore a previous backup point, verify, restore to
> newest backup point. The previous backup point should be a full or
> incremental point.
>
> Vlad: I'm obviously curious to see what you think about this stuff, in
> addition to what you already had in mind :)
>


Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-11 Thread Vladimir Rodionov
Stack, Andrew

We have the doc blocker and (partially) HBASE-15227: two sub-tasks remain, one
for a unit test (you can't call that a blocker)
and another for FT support during incremental backup with bulk loading. The
latter has probably already been addressed
in other HBASE-15527 subtasks. I have to reassess this.

That is mostly it. Yes, we have not done real testing with real data on a
real cluster yet, except QA testing on a small OpenStack
cluster (10 nodes). That is probably our biggest minus right now. I
would like to inform the community that this week we are going to start
full-scale testing with reasonably sized data sets.

The recently committed improvements, such as the ability to run backup/restore
on a particular Yarn pool (queue), allow precise control
of cluster utilization during the operation (so that it does not interfere much
with regular cluster operations). Another one -
converting WALs to HFiles on the fly - significantly improves storage usage
on the backup site.

My plan is to finish HBASE-17825 (further performance optimizations). This
will cut down the number of MR jobs during incremental backup
from 2*N to 2 (N being the number of tables). That will probably take 2-3 more
days.

Then:

1. Address remaining two sub-tasks in HBASE-15227
2. Update Release notes for all relevant B JIRAs
3. Work on doc

After that we can call the feature complete. Taking into account the
vast amount of effort
spent on this feature (including QA testing), I would say that we are
probably quite close to GA right now, but only
after real testing is done (I do not anticipate significant issues, except
perhaps around correct failure handling).

On the feature itself: we provide tools to fully automate backup and restore
tasks: creating backups (full and incremental), restoring
from an image, deleting backups, merging backups, history, history per table,
and backup set management.
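
For illustration, a rough sketch of what that command surface looks like; the
exact flag names and argument order shown here are from memory and may differ,
so treat them as placeholders until the doc update lands:

  # full and incremental backups of two tables to a backup root (flags illustrative)
  hbase backup create full hdfs://backup-host:8020/backups -t table1,table2
  hbase backup create incremental hdfs://backup-host:8020/backups -t table1,table2

  # inspect and manage existing backup images (arguments illustrative)
  hbase backup history
  hbase backup delete <backup_id>
  hbase backup merge <backup_id_1>,<backup_id_2>

  # restore a table from a given backup image
  hbase restore hdfs://backup-host:8020/backups <backup_id> -t table1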

Hopefully, this write-up addresses at least some of your concerns.

-Vlad

On Sun, Sep 10, 2017 at 6:27 AM, Josh Elser <els...@apache.org> wrote:

> On Sat, Sep 9, 2017 at 7:04 PM, stack <saint@gmail.com> wrote:
> > In spite of repeated requests for eng summary of state of this feature --
> > summary of what is in 2.0, what is not, what the capabilities are, how
> well
> > it has been tested and at what scale -- all I get, when the requests are
> > not ignored, are pointers to lists of ill-describing jiras and some
> pending
> > user facing doc update.
>
> Yes, this is a problem. We, especially you as RM, shouldn't have
> outstanding questions as to the quality/state of B
>
> > For other features, mob or region server groups, I know that they have
> been
> > running at scale in production for as much as a year and more. I have
> some
> > confidence these items basically work.  For backup/restore I have no such
> > sense even after spending time in review and trying to use the feature.
>
> I can attest to the feature being tested on small clusters. I'm not
> sure about larger than 10node tests. If this is less a worry and more
> a veto, let's get some criteria on the kind of testing you're looking
> for to avoid having to rehash later.
>
> Do we have any kind of integration tests in the codebase now that can
> help increase Stack's confidence?
>
> > As release manager, I have say over what makes it into a release.  Unless
> > the work is done to convince me that backup/restore is more than a lump
> of
> > code and a few unit tests that can pass on some fellows laptop, I am
> going
> > to kick it out of branch-2.  Let the feature harden more in master branch
> > before it ships in a release.
>
> While it was a few months ago now, I can also attest to this being
> more than some unit tests (I think I looked at it after I saw you last
> down in the weeds).
>
> I do worry about trying to remove it at this state.
>
> * Do you consider the B code in the repository implicitly harmful?
> Is there harm in shipping with docs capturing the concern.
> * Trying to revert all relevant pieces from branch-2 is non-trivial.
> * I would feel quite dejected if some feature I spent a year+ working
> on (*not* making assertions on my perception of quality) was removed
> from the release line it was expected to land.
>
> > S
> >
> > On Sep 8, 2017 10:59 PM, "Vladimir Rodionov" <vladrodio...@gmail.com>
> wrote:
> >
> >> >> Have I grasped the state of things correctly, Vlad?
> >>
> >> Josh, the only thing which is still pending is doc update. All other
> >> features are good to have but not a blockers for 2.0 release.
> >>
> >> -Vlad
> >>
> >> On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <
> vladrodio...@gmail.com
> >> >
> >> wrote:
> >>

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-09 Thread Vladimir Rodionov
That is what I thought. Thanks. Can you tell me that you do not consider AM
v2 an unfinished and untested feature? The question goes to Stack as well.

On Sat, Sep 9, 2017 at 5:53 PM, Andrew Purtell <andrew.purt...@gmail.com>
wrote:

> No all I have to do is pay attention to words you have written yourself in
> emails and on JIRA. Don't argue with us not to believe our lying eyes,
> consider finishing the work. I'll be happy to try it out when you indicate
> it can work if anything happens to fail on the cluster at the time. Until
> then there are a lot of other things need doing first.
>
>
> On Sep 9, 2017, at 5:25 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
>
> >>> but the impression we have is it is unfinished and untested.
> > To make a conclusion that "feature is not finished and tested"  you have
> > had to test it at least.
> > Andrew, If you have discovered issues, why wouldn't you open bug JIRAs?
> >
> > -Vlad
> >
> > On Sat, Sep 9, 2017 at 4:40 PM, Andrew Purtell <andrew.purt...@gmail.com
> >
> > wrote:
> >
> >> For what it's worth, I think AMv2 is the main reason to have a 2.0 in
> the
> >> first place, so I would both agree it needs a lot more testing and yet I
> >> would want us to have a 2.0 release as the vehicle for getting that to
> >> happen. For other features without testing from a number of parties or
> at
> >> scale the value proposition is less clear and it's fine by me for the
> RM to
> >> set them aside for future releases.
> >>
> >> Also, I can relay that there is some interest where I work in utilizing
> >> HBASE-7912 but the impression we have is it is unfinished and untested.
> So
> >> for now we are ignoring it and continuing with home grown solutions.
> Part
> >> of the problem is fault tolerance was left to the last phase(s) and yet
> it
> >> is an essential property for adoption for serious work. The best way to
> >> resolve this IMHO is for the developers of this feature to complete
> those
> >> unfinished JIRAs, especially concerning resilience to failures.
> >>
> >>
> >>> On Sep 9, 2017, at 4:11 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> >> wrote:
> >>>
> >>> Hmm, the next on your list (of kicked out from branch v2) should be AM
> >> v2 I
> >>> presume?
> >>>
> >>> -Vlad
> >>>
> >>>> On Sat, Sep 9, 2017 at 4:04 PM, stack <saint@gmail.com> wrote:
> >>>>
> >>>> In spite of repeated requests for eng summary of state of this feature
> >> --
> >>>> summary of what is in 2.0, what is not, what the capabilities are, how
> >> well
> >>>> it has been tested and at what scale -- all I get, when the requests
> are
> >>>> not ignored, are pointers to lists of ill-describing jiras and some
> >> pending
> >>>> user facing doc update.
> >>>>
> >>>> For other features, mob or region server groups, I know that they have
> >> been
> >>>> running at scale in production for as much as a year and more. I have
> >> some
> >>>> confidence these items basically work.  For backup/restore I have no
> >> such
> >>>> sense even after spending time in review and trying to use the
> feature.
> >>>>
> >>>> As release manager, I have say over what makes it into a release.
> >> Unless
> >>>> the work is done to convince me that backup/restore is more than a
> lump
> >> of
> >>>> code and a few unit tests that can pass on some fellows laptop, I am
> >> going
> >>>> to kick it out of branch-2.  Let the feature harden more in master
> >> branch
> >>>> before it ships in a release.
> >>>>
> >>>> S
> >>>>
> >>>> On Sep 8, 2017 10:59 PM, "Vladimir Rodionov" <vladrodio...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>>>> Have I grasped the state of things correctly, Vlad?
> >>>>>
> >>>>> Josh, the only thing which is still pending is doc update. All other
> >>>>> features are good to have but not a blockers for 2.0 release.
> >>>>>
> >>>>> -Vlad
> >>>>>
> >>>>> On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <
> >>>> vladrodio...@gmail.com

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-09 Thread Vladimir Rodionov
>> but the impression we have is it is unfinished and untested.
To conclude that the "feature is not finished and tested," you would have had
to test it first, at least.
Andrew, if you have discovered issues, why wouldn't you open bug JIRAs?

-Vlad

On Sat, Sep 9, 2017 at 4:40 PM, Andrew Purtell <andrew.purt...@gmail.com>
wrote:

> For what it's worth, I think AMv2 is the main reason to have a 2.0 in the
> first place, so I would both agree it needs a lot more testing and yet I
> would want us to have a 2.0 release as the vehicle for getting that to
> happen. For other features without testing from a number of parties or at
> scale the value proposition is less clear and it's fine by me for the RM to
> set them aside for future releases.
>
> Also, I can relay that there is some interest where I work in utilizing
> HBASE-7912 but the impression we have is it is unfinished and untested. So
> for now we are ignoring it and continuing with home grown solutions. Part
> of the problem is fault tolerance was left to the last phase(s) and yet it
> is an essential property for adoption for serious work. The best way to
> resolve this IMHO is for the developers of this feature to complete those
> unfinished JIRAs, especially concerning resilience to failures.
>
>
> > On Sep 9, 2017, at 4:11 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
> >
> > Hmm, the next on your list (of kicked out from branch v2) should be AM
> v2 I
> > presume?
> >
> > -Vlad
> >
> >> On Sat, Sep 9, 2017 at 4:04 PM, stack <saint@gmail.com> wrote:
> >>
> >> In spite of repeated requests for eng summary of state of this feature
> --
> >> summary of what is in 2.0, what is not, what the capabilities are, how
> well
> >> it has been tested and at what scale -- all I get, when the requests are
> >> not ignored, are pointers to lists of ill-describing jiras and some
> pending
> >> user facing doc update.
> >>
> >> For other features, mob or region server groups, I know that they have
> been
> >> running at scale in production for as much as a year and more. I have
> some
> >> confidence these items basically work.  For backup/restore I have no
> such
> >> sense even after spending time in review and trying to use the feature.
> >>
> >> As release manager, I have say over what makes it into a release.
> Unless
> >> the work is done to convince me that backup/restore is more than a lump
> of
> >> code and a few unit tests that can pass on some fellows laptop, I am
> going
> >> to kick it out of branch-2.  Let the feature harden more in master
> branch
> >> before it ships in a release.
> >>
> >> S
> >>
> >> On Sep 8, 2017 10:59 PM, "Vladimir Rodionov" <vladrodio...@gmail.com>
> >> wrote:
> >>
> >>>>> Have I grasped the state of things correctly, Vlad?
> >>>
> >>> Josh, the only thing which is still pending is doc update. All other
> >>> features are good to have but not a blockers for 2.0 release.
> >>>
> >>> -Vlad
> >>>
> >>> On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <
> >> vladrodio...@gmail.com
> >>>>
> >>> wrote:
> >>>
> >>>>>> What testing and at what
> >>>>>> scale has testing been done?
> >>>>
> >>>> Do we have have that for other features?
> >>>>
> >>>>
> >>>> On Fri, Sep 8, 2017 at 10:41 PM, Vladimir Rodionov <
> >>> vladrodio...@gmail.com
> >>>>> wrote:
> >>>>
> >>>>>>> It asks: "How do I figure what of backup/restore feature is going
> >> to
> >>>>> be in
> >>>>>>> hbase-2.0.0?
> >>>>>
> >>>>> Hmm, wait for doc update.
> >>>>>
> >>>>>
> >>>>>> On Fri, Sep 8, 2017 at 2:39 PM, Stack <st...@duboce.net> wrote:
> >>>>>>
> >>>>>> HBASE-14414 is a JIRA with a list of random seeming issues w/
> >>>>>> non-descript
> >>>>>> summaries: "Add nonce support to TableBackupProcedure, BackupID must
> >>>>>> include backup set name, ...". The last comment in that issue is
> from
> >>>>>> July.
> >>>>>> It asks: "How do I figure what of backup/restore feature is going to
> >> be
> >>>

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-09 Thread Vladimir Rodionov
Hmm, the next one on your list (of features kicked out of branch-2) should be
AM v2, I presume?

-Vlad

On Sat, Sep 9, 2017 at 4:04 PM, stack <saint@gmail.com> wrote:

> In spite of repeated requests for eng summary of state of this feature --
> summary of what is in 2.0, what is not, what the capabilities are, how well
> it has been tested and at what scale -- all I get, when the requests are
> not ignored, are pointers to lists of ill-describing jiras and some pending
> user facing doc update.
>
> For other features, mob or region server groups, I know that they have been
> running at scale in production for as much as a year and more. I have some
> confidence these items basically work.  For backup/restore I have no such
> sense even after spending time in review and trying to use the feature.
>
> As release manager, I have say over what makes it into a release.  Unless
> the work is done to convince me that backup/restore is more than a lump of
> code and a few unit tests that can pass on some fellows laptop, I am going
> to kick it out of branch-2.  Let the feature harden more in master branch
> before it ships in a release.
>
> S
>
> On Sep 8, 2017 10:59 PM, "Vladimir Rodionov" <vladrodio...@gmail.com>
> wrote:
>
> > >> Have I grasped the state of things correctly, Vlad?
> >
> > Josh, the only thing which is still pending is doc update. All other
> > features are good to have but not a blockers for 2.0 release.
> >
> > -Vlad
> >
> > On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <
> vladrodio...@gmail.com
> > >
> > wrote:
> >
> > > >> What testing and at what
> > > >> scale has testing been done?
> > >
> > > Do we have have that for other features?
> > >
> > >
> > > On Fri, Sep 8, 2017 at 10:41 PM, Vladimir Rodionov <
> > vladrodio...@gmail.com
> > > > wrote:
> > >
> > >> >> It asks: "How do I figure what of backup/restore feature is going
> to
> > >> be in
> > >> >>hbase-2.0.0?
> > >>
> > >> Hmm, wait for doc update.
> > >>
> > >>
> > >> On Fri, Sep 8, 2017 at 2:39 PM, Stack <st...@duboce.net> wrote:
> > >>
> > >>> HBASE-14414 is a JIRA with a list of random seeming issues w/
> > >>> non-descript
> > >>> summaries: "Add nonce support to TableBackupProcedure, BackupID must
> > >>> include backup set name, ...". The last comment in that issue is from
> > >>> July.
> > >>> It asks: "How do I figure what of backup/restore feature is going to
> be
> > >>> in
> > >>> hbase-2.0.0? Thanks Vladimir Rodionov
> > >>> <https://issues.apache.org/jira/secure/ViewProfile.jspa?
> name=vrodionov
> > >>> >."
> > >>> to which there is no answer.  Doc update is TODO.
> > >>>
> > >>> Where is the summary of the capability in hbase-2? What testing and
> at
> > >>> what
> > >>> scale has testing been done? Is this 'stable or experimental'? If I
> > can't
> > >>> get basic info on this feature though I ask repeatedly, what hope
> does
> > >>> the
> > >>> poor old operator have?
> > >>>
> > >>> St.Ack
> > >>>
> > >>>
> > >>> On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <
> > >>> vladrodio...@gmail.com>
> > >>> wrote:
> > >>>
> > >>> > HBASE-14414
> > >>> >
> > >>> > On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:
> > >>> >
> > >>> > > Where do I go to get the current status of this feature? Looking
> in
> > >>> JIRA
> > >>> > I
> > >>> > > see loads of issues open against backup including some against
> > >>> > hbase-2.0.0
> > >>> > > and no progress being made that I can discern.
> > >>> > >
> > >>> > > Thanks,
> > >>> > > S
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
> > >>> > >
> > >>> > > > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net>
> wrote:
> > >>> > 

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-09 Thread Vladimir Rodionov
>> Have I grasped the state of things correctly, Vlad?

Josh, the only thing still pending is the doc update. All the other items are
good to have but are not blockers for the 2.0 release.

-Vlad

On Fri, Sep 8, 2017 at 10:42 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> >> What testing and at what
> >> scale has testing been done?
>
> Do we have have that for other features?
>
>
> On Fri, Sep 8, 2017 at 10:41 PM, Vladimir Rodionov <vladrodio...@gmail.com
> > wrote:
>
>> >> It asks: "How do I figure what of backup/restore feature is going to
>> be in
>> >>hbase-2.0.0?
>>
>> Hmm, wait for doc update.
>>
>>
>> On Fri, Sep 8, 2017 at 2:39 PM, Stack <st...@duboce.net> wrote:
>>
>>> HBASE-14414 is a JIRA with a list of random seeming issues w/
>>> non-descript
>>> summaries: "Add nonce support to TableBackupProcedure, BackupID must
>>> include backup set name, ...". The last comment in that issue is from
>>> July.
>>> It asks: "How do I figure what of backup/restore feature is going to be
>>> in
>>> hbase-2.0.0? Thanks Vladimir Rodionov
>>> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=vrodionov
>>> >."
>>> to which there is no answer.  Doc update is TODO.
>>>
>>> Where is the summary of the capability in hbase-2? What testing and at
>>> what
>>> scale has testing been done? Is this 'stable or experimental'? If I can't
>>> get basic info on this feature though I ask repeatedly, what hope does
>>> the
>>> poor old operator have?
>>>
>>> St.Ack
>>>
>>>
>>> On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <
>>> vladrodio...@gmail.com>
>>> wrote:
>>>
>>> > HBASE-14414
>>> >
>>> > On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:
>>> >
>>> > > Where do I go to get the current status of this feature? Looking in
>>> JIRA
>>> > I
>>> > > see loads of issues open against backup including some against
>>> > hbase-2.0.0
>>> > > and no progress being made that I can discern.
>>> > >
>>> > > Thanks,
>>> > > S
>>> > >
>>> > >
>>> > >
>>> > > On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
>>> > >
>>> > > > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net> wrote:
>>> > > >
>>> > > >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
>>> > > >> vladrodio...@gmail.com> wrote:
>>> > > >>
>>> > > >>> >> and/or he answered most of the review feedback
>>> > > >>>
>>> > > >>> No, questions are still open, but I do not see any blockers and
>>> we
>>> > have
>>> > > >>> HBASE-16940 to address these questions.
>>> > > >>>
>>> > > >>>
>>> > > >> Agree. No blockers but stuff that should be dealt with (No one
>>> will
>>> > pay
>>> > > >> me any attention once merge goes in -- smile).
>>> > > >>
>>> > > >>
>>> > > > Let me clarify the above. I want review addressed before merge
>>> happens.
>>> > > > Sorry if any confusion.
>>> > > > St.Ack
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >> St.Ack
>>> > > >>
>>> > > >>
>>> > > >>
>>> > > >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <
>>> d...@hortonworks.com>
>>> > > >>> wrote:
>>> > > >>>
>>> > > >>> > Hi Stack, hats off to you for spending so much time on this!
>>> > Thanks!
>>> > > >>> From
>>> > > >>> > my understanding, Vlad has raised follow-up jiras for the
>>> issues
>>> > you
>>> > > >>> > raised, and/or he answered most of the review feedback. So, do
>>> you
>>> > > >>> think we
>>> > > >>> > could do a merge vote now?

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-08 Thread Vladimir Rodionov
>> What testing and at what
>> scale has testing been done?

Do we have that for other features?


On Fri, Sep 8, 2017 at 10:41 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> >> It asks: "How do I figure what of backup/restore feature is going to
> be in
> >>hbase-2.0.0?
>
> Hmm, wait for doc update.
>
>
> On Fri, Sep 8, 2017 at 2:39 PM, Stack <st...@duboce.net> wrote:
>
>> HBASE-14414 is a JIRA with a list of random seeming issues w/ non-descript
>> summaries: "Add nonce support to TableBackupProcedure, BackupID must
>> include backup set name, ...". The last comment in that issue is from
>> July.
>> It asks: "How do I figure what of backup/restore feature is going to be in
>> hbase-2.0.0? Thanks Vladimir Rodionov
>> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=vrodionov>."
>> to which there is no answer.  Doc update is TODO.
>>
>> Where is the summary of the capability in hbase-2? What testing and at
>> what
>> scale has testing been done? Is this 'stable or experimental'? If I can't
>> get basic info on this feature though I ask repeatedly, what hope does the
>> poor old operator have?
>>
>> St.Ack
>>
>>
>> On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <vladrodio...@gmail.com
>> >
>> wrote:
>>
>> > HBASE-14414
>> >
>> > On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:
>> >
>> > > Where do I go to get the current status of this feature? Looking in
>> JIRA
>> > I
>> > > see loads of issues open against backup including some against
>> > hbase-2.0.0
>> > > and no progress being made that I can discern.
>> > >
>> > > Thanks,
>> > > S
>> > >
>> > >
>> > >
>> > > On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
>> > >
>> > > > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net> wrote:
>> > > >
>> > > >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
>> > > >> vladrodio...@gmail.com> wrote:
>> > > >>
>> > > >>> >> and/or he answered most of the review feedback
>> > > >>>
>> > > >>> No, questions are still open, but I do not see any blockers and we
>> > have
>> > > >>> HBASE-16940 to address these questions.
>> > > >>>
>> > > >>>
>> > > >> Agree. No blockers but stuff that should be dealt with (No one will
>> > pay
>> > > >> me any attention once merge goes in -- smile).
>> > > >>
>> > > >>
>> > > > Let me clarify the above. I want review addressed before merge
>> happens.
>> > > > Sorry if any confusion.
>> > > > St.Ack
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >> St.Ack
>> > > >>
>> > > >>
>> > > >>
>> > > >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <
>> d...@hortonworks.com>
>> > > >>> wrote:
>> > > >>>
>> > > >>> > Hi Stack, hats off to you for spending so much time on this!
>> > Thanks!
>> > > >>> From
>> > > >>> > my understanding, Vlad has raised follow-up jiras for the issues
>> > you
>> > > >>> > raised, and/or he answered most of the review feedback. So, do
>> you
>> > > >>> think we
>> > > >>> > could do a merge vote now?
>> > > >>> > Devaraj.
>> > > >>> > 
>> > > >>> > From: Vladimir Rodionov <vladrodio...@gmail.com>
>> > > >>> > Sent: Monday, November 21, 2016 8:34 PM
>> > > >>> > To: dev@hbase.apache.org
>> > > >>> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch
>> > HBASE-7912
>> > > >>> >
>> > > >>> > >> I have spent a good bit of time reviewing and testing this
>> > > feature.
>> > > >>> I
>> > > >>> > would
>> > > >>> > >> like my review and concerns addressed and 

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-08 Thread Vladimir Rodionov
>> It asks: "How do I figure what of backup/restore feature is going to be
in
>>hbase-2.0.0?

Hmm, wait for doc update.


On Fri, Sep 8, 2017 at 2:39 PM, Stack <st...@duboce.net> wrote:

> HBASE-14414 is a JIRA with a list of random seeming issues w/ non-descript
> summaries: "Add nonce support to TableBackupProcedure, BackupID must
> include backup set name, ...". The last comment in that issue is from July.
> It asks: "How do I figure what of backup/restore feature is going to be in
> hbase-2.0.0? Thanks Vladimir Rodionov
> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=vrodionov>."
> to which there is no answer.  Doc update is TODO.
>
> Where is the summary of the capability in hbase-2? What testing and at what
> scale has testing been done? Is this 'stable or experimental'? If I can't
> get basic info on this feature though I ask repeatedly, what hope does the
> poor old operator have?
>
> St.Ack
>
>
> On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
>
> > HBASE-14414
> >
> > On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:
> >
> > > Where do I go to get the current status of this feature? Looking in
> JIRA
> > I
> > > see loads of issues open against backup including some against
> > hbase-2.0.0
> > > and no progress being made that I can discern.
> > >
> > > Thanks,
> > > S
> > >
> > >
> > >
> > > On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net> wrote:
> > > >
> > > >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
> > > >> vladrodio...@gmail.com> wrote:
> > > >>
> > > >>> >> and/or he answered most of the review feedback
> > > >>>
> > > >>> No, questions are still open, but I do not see any blockers and we
> > have
> > > >>> HBASE-16940 to address these questions.
> > > >>>
> > > >>>
> > > >> Agree. No blockers but stuff that should be dealt with (No one will
> > pay
> > > >> me any attention once merge goes in -- smile).
> > > >>
> > > >>
> > > > Let me clarify the above. I want review addressed before merge
> happens.
> > > > Sorry if any confusion.
> > > > St.Ack
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >> St.Ack
> > > >>
> > > >>
> > > >>
> > > >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <d...@hortonworks.com
> >
> > > >>> wrote:
> > > >>>
> > > >>> > Hi Stack, hats off to you for spending so much time on this!
> > Thanks!
> > > >>> From
> > > >>> > my understanding, Vlad has raised follow-up jiras for the issues
> > you
> > > >>> > raised, and/or he answered most of the review feedback. So, do
> you
> > > >>> think we
> > > >>> > could do a merge vote now?
> > > >>> > Devaraj.
> > > >>> > 
> > > >>> > From: Vladimir Rodionov <vladrodio...@gmail.com>
> > > >>> > Sent: Monday, November 21, 2016 8:34 PM
> > > >>> > To: dev@hbase.apache.org
> > > >>> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch
> > HBASE-7912
> > > >>> >
> > > >>> > >> I have spent a good bit of time reviewing and testing this
> > > feature.
> > > >>> I
> > > >>> > would
> > > >>> > >> like my review and concerns addressed and I'd like it to be
> > clear
> > > >>> how;
> > > >>> > >> either explicit follow-on issues, pointers to where in the
> patch
> > > or
> > > >>> doc
> > > >>> > my
> > > >>> > >> remarks have been catered to, etc. Until then, I am against
> > > commit.
> > > >>> >
> > > >>> > Stack, mega patch review comments will be addressed in the
> > dedicated
> > > >>> JIRA:
> > > >>> > HBASE-16940

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-08 Thread Vladimir Rodionov
All blockers have been resolved. Some remaining JIRAs are good to have but
are not blockers.

On Fri, Sep 8, 2017 at 1:59 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> HBASE-14414
>
> On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:
>
>> Where do I go to get the current status of this feature? Looking in JIRA I
>> see loads of issues open against backup including some against hbase-2.0.0
>> and no progress being made that I can discern.
>>
>> Thanks,
>> S
>>
>>
>>
>> On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
>>
>> > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net> wrote:
>> >
>> >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
>> >> vladrodio...@gmail.com> wrote:
>> >>
>> >>> >> and/or he answered most of the review feedback
>> >>>
>> >>> No, questions are still open, but I do not see any blockers and we
>> have
>> >>> HBASE-16940 to address these questions.
>> >>>
>> >>>
>> >> Agree. No blockers but stuff that should be dealt with (No one will pay
>> >> me any attention once merge goes in -- smile).
>> >>
>> >>
>> > Let me clarify the above. I want review addressed before merge happens.
>> > Sorry if any confusion.
>> > St.Ack
>> >
>> >
>> >
>> >
>> >
>> >
>> >> St.Ack
>> >>
>> >>
>> >>
>> >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <d...@hortonworks.com>
>> >>> wrote:
>> >>>
>> >>> > Hi Stack, hats off to you for spending so much time on this! Thanks!
>> >>> From
>> >>> > my understanding, Vlad has raised follow-up jiras for the issues you
>> >>> > raised, and/or he answered most of the review feedback. So, do you
>> >>> think we
>> >>> > could do a merge vote now?
>> >>> > Devaraj.
>> >>> > 
>> >>> > From: Vladimir Rodionov <vladrodio...@gmail.com>
>> >>> > Sent: Monday, November 21, 2016 8:34 PM
>> >>> > To: dev@hbase.apache.org
>> >>> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
>> >>> >
>> >>> > >> I have spent a good bit of time reviewing and testing this
>> feature.
>> >>> I
>> >>> > would
>> >>> > >> like my review and concerns addressed and I'd like it to be clear
>> >>> how;
>> >>> > >> either explicit follow-on issues, pointers to where in the patch
>> or
>> >>> doc
>> >>> > my
>> >>> > >> remarks have been catered to, etc. Until then, I am against
>> commit.
>> >>> >
>> >>> > Stack, mega patch review comments will be addressed in the dedicated
>> >>> JIRA:
>> >>> > HBASE-16940
>> >>> > I have open several other JIRAs to address your other comments (not
>> on
>> >>> > review board).
>> >>> >
>> >>> > Details are here (end of the thread):
>> >>> > https://issues.apache.org/jira/browse/HBASE-14123
>> >>> >
>> >>> > Let me know what else should we do to move merge forward.
>> >>> >
>> >>> > -Vlad
>> >>> >
>> >>> >
>> >>> > On Fri, Nov 18, 2016 at 4:54 PM, Stack <st...@duboce.net> wrote:
>> >>> >
>> >>> > > On Fri, Nov 18, 2016 at 3:53 PM, Ted Yu <yuzhih...@gmail.com>
>> wrote:
>> >>> > >
>> >>> > > > Thanks, Matteo.
>> >>> > > >
>> >>> > > > bq. restore is not clear if given an incremental id it will do
>> the
>> >>> full
>> >>> > > > restore from full up to that point or if i need to apply
>> manually
>> >>> > > > everything
>> >>> > > >
>> >>> > > > The restore takes the dependent backup(s) into consideration.
>> >>> > > > So there is no need to apply preceding backup(s) manually.
>> >>> > > >
>> >>> > > >

Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2017-09-08 Thread Vladimir Rodionov
HBASE-14414

On Fri, Sep 8, 2017 at 1:14 PM, Stack <st...@duboce.net> wrote:

> Where do I go to get the current status of this feature? Looking in JIRA I
> see loads of issues open against backup including some against hbase-2.0.0
> and no progress being made that I can discern.
>
> Thanks,
> S
>
>
>
> On Wed, Nov 23, 2016 at 8:52 AM, Stack <st...@duboce.net> wrote:
>
> > On Tue, Nov 22, 2016 at 6:48 PM, Stack <st...@duboce.net> wrote:
> >
> >> On Tue, Nov 22, 2016 at 3:17 PM, Vladimir Rodionov <
> >> vladrodio...@gmail.com> wrote:
> >>
> >>> >> and/or he answered most of the review feedback
> >>>
> >>> No, questions are still open, but I do not see any blockers and we have
> >>> HBASE-16940 to address these questions.
> >>>
> >>>
> >> Agree. No blockers but stuff that should be dealt with (No one will pay
> >> me any attention once merge goes in -- smile).
> >>
> >>
> > Let me clarify the above. I want review addressed before merge happens.
> > Sorry if any confusion.
> > St.Ack
> >
> >
> >
> >
> >
> >
> >> St.Ack
> >>
> >>
> >>
> >>> On Tue, Nov 22, 2016 at 3:04 PM, Devaraj Das <d...@hortonworks.com>
> >>> wrote:
> >>>
> >>> > Hi Stack, hats off to you for spending so much time on this! Thanks!
> >>> From
> >>> > my understanding, Vlad has raised follow-up jiras for the issues you
> >>> > raised, and/or he answered most of the review feedback. So, do you
> >>> think we
> >>> > could do a merge vote now?
> >>> > Devaraj.
> >>> > 
> >>> > From: Vladimir Rodionov <vladrodio...@gmail.com>
> >>> > Sent: Monday, November 21, 2016 8:34 PM
> >>> > To: dev@hbase.apache.org
> >>> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
> >>> >
> >>> > >> I have spent a good bit of time reviewing and testing this
> feature.
> >>> I
> >>> > would
> >>> > >> like my review and concerns addressed and I'd like it to be clear
> >>> how;
> >>> > >> either explicit follow-on issues, pointers to where in the patch
> or
> >>> doc
> >>> > my
> >>> > >> remarks have been catered to, etc. Until then, I am against
> commit.
> >>> >
> >>> > Stack, mega patch review comments will be addressed in the dedicated
> >>> JIRA:
> >>> > HBASE-16940
> >>> > I have open several other JIRAs to address your other comments (not
> on
> >>> > review board).
> >>> >
> >>> > Details are here (end of the thread):
> >>> > https://issues.apache.org/jira/browse/HBASE-14123
> >>> >
> >>> > Let me know what else should we do to move merge forward.
> >>> >
> >>> > -Vlad
> >>> >
> >>> >
> >>> > On Fri, Nov 18, 2016 at 4:54 PM, Stack <st...@duboce.net> wrote:
> >>> >
> >>> > > On Fri, Nov 18, 2016 at 3:53 PM, Ted Yu <yuzhih...@gmail.com>
> wrote:
> >>> > >
> >>> > > > Thanks, Matteo.
> >>> > > >
> >>> > > > bq. restore is not clear if given an incremental id it will do
> the
> >>> full
> >>> > > > restore from full up to that point or if i need to apply manually
> >>> > > > everything
> >>> > > >
> >>> > > > The restore takes the dependent backup(s) into consideration.
> >>> > > > So there is no need to apply preceding backup(s) manually.
> >>> > > >
> >>> > > >
> >>> > > I ask this question on the issue. It is not clear from the usage or
> >>> doc
> >>> > how
> >>> > > to run a restore from incremental. Can you fix in doc and usage how
> >>> so I
> >>> > > can be clear and try it. Currently I am stuck verifying a round
> trip
> >>> > backup
> >>> > > restore made of incrementals.
> >>> > >
> >>> > > Thanks,
> >>> > > S
> >>> > >
> >>> > >
> >>> > >
> >>

[jira] [Created] (HBASE-18646) [Backup] LogRollMasterProcedureManager: make procedure timeout, thread pool size configurable

2017-08-21 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18646:
-

 Summary: [Backup] LogRollMasterProcedureManager: make procedure 
timeout, thread pool size configurable
 Key: HBASE-18646
 URL: https://issues.apache.org/jira/browse/HBASE-18646
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


The default procedure timeout of 60 sec and pool size (1) may not be optimal 
for large deployments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Bug in FIFOCompactionPolicy pre-checks?

2017-08-04 Thread Vladimir Rodionov
Yes, file a JIRA, Lars

I will take a look

-Vlad


On Thu, Aug 3, 2017 at 11:41 PM, Lars George  wrote:

> Hi,
>
> See https://issues.apache.org/jira/browse/HBASE-14468
>
> It adds this check to {{HMaster.checkCompactionPolicy()}}:
>
> {code}
> // 1. Check TTL
> if (hcd.getTimeToLive() == HColumnDescriptor.DEFAULT_TTL) {
>   message = "Default TTL is not supported for FIFO compaction";
>   throw new IOException(message);
> }
>
> // 2. Check min versions
> if (hcd.getMinVersions() > 0) {
>   message = "MIN_VERSION > 0 is not supported for FIFO compaction";
>   throw new IOException(message);
> }
>
> // 3. blocking file count
> String sbfc = htd.getConfigurationValue(HStore.BLOCKING_STOREFILES_KEY);
> if (sbfc != null) {
>   blockingFileCount = Integer.parseInt(sbfc);
> }
> if (blockingFileCount < 1000) {
>   message =
>   "blocking file count '" + HStore.BLOCKING_STOREFILES_KEY + "' "
> + blockingFileCount
>   + " is below recommended minimum of 1000";
>   throw new IOException(message);
> }
> {code}
>
> Why does it only check the blocking file count on the HTD level, while
> others are checked on the HCD level? The following, for example, fails
> because of it:
>
> {noformat}
> hbase(main):008:0> create 'ttltable', { NAME => 'cf1', TTL => 300,
> CONFIGURATION => { 'hbase.hstore.defaultengine.compactionpolicy.class'
> => 'org.apache.hadoop.hbase.regionserver.compactions.
> FIFOCompactionPolicy',
> 'hbase.hstore.blockingStoreFiles' => 2000 } }
>
> ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: blocking file
> count 'hbase.hstore.blockingStoreFiles' 10 is below recommended
> minimum of 1000 Set hbase.table.sanity.checks to false at conf or
> table descriptor if you want to bypass sanity checks
> at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure
> (HMaster.java:1782)
> at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(
> HMaster.java:1663)
> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1545)
> at org.apache.hadoop.hbase.master.MasterRpcServices.
> createTable(MasterRpcServices.java:469)
> at org.apache.hadoop.hbase.protobuf.generated.
> MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58549)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(
> RpcExecutor.java:188)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(
> RpcExecutor.java:168)
> Caused by: java.io.IOException: blocking file count
> 'hbase.hstore.blockingStoreFiles' 10 is below recommended minimum of
> 1000
> at org.apache.hadoop.hbase.master.HMaster.checkCompactionPolicy(HMaster.
> java:1773)
> at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(
> HMaster.java:1661)
> ... 7 more
> {noformat}
>
> That should work on the column family level, right? Shall I file a JIRA?
>
> Cheers,
> Lars
>
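
A minimal sketch of a workaround under the current check, assuming the HBase 1.x/2.x
client API (table and family names are illustrative): because
HMaster.checkCompactionPolicy() reads hbase.hstore.blockingStoreFiles via
htd.getConfigurationValue(), the raised value has to live on the table descriptor,
not only on the column family.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateFifoTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      HColumnDescriptor hcd = new HColumnDescriptor("cf1");
      hcd.setTimeToLive(300); // FIFO compaction requires a non-default TTL
      hcd.setConfiguration(
          "hbase.hstore.defaultengine.compactionpolicy.class",
          "org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy");

      HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("ttltable"));
      htd.addFamily(hcd);
      // The sanity check above consults the table descriptor, so the blocking
      // file count is raised here rather than only on the column family.
      htd.setConfiguration("hbase.hstore.blockingStoreFiles", "2000");

      admin.createTable(htd);
    }
  }
}
{code}

Whether the check should instead (or also) consult the family-level value is exactly
the question raised above.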


[jira] [Resolved] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2017-07-25 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-7912.
--
Resolution: Fixed

Closing this one. Refer to https://issues.apache.org/jira/browse/HBASE-14414 
for any further updates

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>    Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore -0.91.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore-v0.9.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf
>
>
> Finally, we have completed the implementation of our backup/restore solution and 
> would like to share it with the community through this JIRA. 
> We leverage the existing HBase snapshot feature and provide a general 
> solution for common users. Our full backup uses a snapshot to capture 
> metadata locally and ExportSnapshot to move data to another cluster; 
> the incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and distributed flush to improve performance, plus 
> added-on features such as convert, merge, progress report, and CLI commands, so 
> that a common user can back up HBase data without in-depth knowledge of HBase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this JIRA. We 
> plan to use 10~12 subtasks to cover each of the following features and 
> document the implementation details in the subtasks: 
> * *Full Backup* : provide local and remote back/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, kept here as the history of the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data. The combination of full backups with incremental backups has 
> tremendous benefit for HBase as well. The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point of time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup 
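
To make the workflow above concrete, here is a sketch of how the CLI is intended to
be driven; the command and option names are assumptions based on documentation of the
feature as it later shipped (see HBASE-14414) and may differ from the exact syntax in
the attached design documents.

{noformat}
# Full backup of two tables to a backup root on HDFS
hbase backup create full hdfs://backup-nn:8020/data/backups -t table1,table2

# Periodic incremental backups that capture changes since the last image
hbase backup create incremental hdfs://backup-nn:8020/data/backups -t table1,table2

# List available backup images and their IDs
hbase backup history

# Restore from an incremental image; dependent images back to the full
# baseline are resolved automatically, so no manual replay is needed
hbase restore hdfs://backup-nn:8020/data/backups backup_1467823988425 -t table1
{noformat}

The backup ID shown is a placeholder; real IDs are generated from the backup start
timestamp.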

[jira] [Created] (HBASE-18425) Fix TestMasterFailover

2017-07-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18425:
-

 Summary: Fix TestMasterFailover
 Key: HBASE-18425
 URL: https://issues.apache.org/jira/browse/HBASE-18425
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18424) Fix TestAsyncTableGetMultiThreaded

2017-07-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18424:
-

 Summary: Fix TestAsyncTableGetMultiThreaded
 Key: HBASE-18424
 URL: https://issues.apache.org/jira/browse/HBASE-18424
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18423) Fix TestMetaWithReplicas

2017-07-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18423:
-

 Summary: Fix TestMetaWithReplicas
 Key: HBASE-18423
 URL: https://issues.apache.org/jira/browse/HBASE-18423
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18422) Fix TestRegionRebalancing

2017-07-20 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-18422:
-

 Summary: Fix TestRegionRebalancing
 Key: HBASE-18422
 URL: https://issues.apache.org/jira/browse/HBASE-18422
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (HBASE-16458) Shorten backup / restore test execution time

2017-07-06 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-16458:
---

> Shorten backup / restore test execution time
> 
>
> Key: HBASE-16458
> URL: https://issues.apache.org/jira/browse/HBASE-16458
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: 16458.HBASE-7912.v3.txt, 16458.HBASE-7912.v4.txt, 
> 16458.HBASE-7912.v5.txt, 16458.v1.txt, 16458.v2.txt, 16458.v3.txt
>
>
> Below was timing information for all the backup / restore tests (today's 
> result):
> {code}
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 576.273 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackup
> Running org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.67 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.34 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupStatusProgress
> Running org.apache.hadoop.hbase.backup.TestBackupAdmin
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 490.251 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupAdmin
> Running org.apache.hadoop.hbase.backup.TestHFileArchiving
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.323 sec - 
> in org.apache.hadoop.hbase.backup.TestHFileArchiving
> Running org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.492 sec - 
> in org.apache.hadoop.hbase.backup.TestSystemTableSnapshot
> Running org.apache.hadoop.hbase.backup.TestBackupDescribe
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.758 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDescribe
> Running org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.187 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupLogCleaner
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 330.539 sec - 
> in org.apache.hadoop.hbase.backup.TestIncrementalBackupNoDataLoss
> Running org.apache.hadoop.hbase.backup.TestRemoteBackup
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.371 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteBackup
> Running org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.893 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupSystemTable
> Running org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.779 sec - 
> in org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests
> Running org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.815 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSetRestoreSet
> Running org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 136.517 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupShowHistory
> Running org.apache.hadoop.hbase.backup.TestRemoteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 91.799 sec - 
> in org.apache.hadoop.hbase.backup.TestRemoteRestore
> Running org.apache.hadoop.hbase.backup.TestFullRestore
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.711 sec 
> - in org.apache.hadoop.hbase.backup.TestFullRestore
> Running org.apache.hadoop.hbase.backup.TestFullBackupSet
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.045 sec - 
> in org.apache.hadoop.hbase.backup.TestFullBackupSet
> Running org.apache.hadoop.hbase.backup.TestBackupDelete
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 86.214 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDelete
> Running org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 77.631 sec - 
> in org.apache.hadoop.hbase.backup.TestBackupDeleteRestore
> Running org.apache.hadoop.hbase.backup.TestIncrementalBackupDeleteTable
> Tests run: 1, Failures: 0, Errors: 0, S
