[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-29 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527219#comment-16527219
 ] 

Ted Yu commented on HBASE-20357:


AccessControlClient is marked InterfaceAudience.Public.

If this goes to branch-2, the changes must be backward compatible.
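
One common way to keep a public API backward compatible while adding new lookups is to leave the existing method signature untouched and add an overload beside it. The sketch below illustrates that idea with a hypothetical stand-in class. AclLookup, its methods and the in-memory grants are assumptions made for illustration; they are not the actual AccessControlClient signatures or the contents of the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a public client API; not the real AccessControlClient.
final class AclLookup {
  // Each entry is {table, user, action}; illustrative data only.
  private static final String[][] GRANTS = {
    {"t1", "userA", "READ"},
    {"t1", "userB", "WRITE"},
    {"t2", "userA", "ADMIN"},
  };

  private AclLookup() {}

  // Existing signature: left unchanged, so callers compiled against it keep working.
  static List<String> getUserPermissions(String table) {
    return getUserPermissions(table, null);
  }

  // New overload: a null or empty userName means "all users", matching the old behavior.
  static List<String> getUserPermissions(String table, String userName) {
    List<String> result = new ArrayList<>();
    for (String[] grant : GRANTS) {
      boolean userMatches =
          userName == null || userName.isEmpty() || userName.equals(grant[1]);
      if (grant[0].equals(table) && userMatches) {
        result.add(grant[1] + ":" + grant[2]);
      }
    }
    return result;
  }
}
```

Because the old method delegates to the new one, both paths share a single implementation and cannot drift apart.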

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
> Currently, HBase ACLs can be retrieved by namespace or table name only. 
> There is no direct API to retrieve the permissions of a specific user by 
> namespace, table name, column family and column qualifier.
> A client has to write multi-step application logic to retrieve ACLs by 
> table name, column family and column qualifier for a specific user.
> HBase should enhance the AccessControlClient APIs to simplify this.
> *The AccessControlClient API should be extended with the following APIs:*
>  # Retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>    Permissions can be retrieved based on the following inputs:
>        - Namespace/Table (already available)
>        - Namespace/Table + UserName
>        - Table + CF
>        - Table + CF + UserName
>        - Table + CF + CQ
>        - Table + CF + CQ + UserName
>    Scope of retrieving permissions:
>        - Same as existing
>  # Validate whether a user is allowed to perform the specified operations on 
> a particular table. This is useful for checking user privileges up front 
> instead of getting ACD during a client operation.
>    User validation can be performed based on the following inputs:
>        - Table + CF + CQ + UserName + Actions
>    Scope of validating user privileges:
>        A user can check its own privileges without any special privilege, 
> but ADMIN privilege is required to check other users.
>        For example, given two users "userA" and "userB", the scenarios are:
>        - When userA checks whether userA has the privilege to perform the 
> given actions, userA does not need ADMIN privilege, as it is a self query.
>        - When userA checks whether userB has the privilege to perform the 
> given actions, userA must have ADMIN or superuser privilege, as it is 
> querying for another user.
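
The scope rule for validating user privileges (self checks are always allowed, checks on other users need ADMIN or superuser privilege) can be sketched as a small predicate. PrivilegeCheckScope and mayCheck are hypothetical names used only for illustration; this is a sketch of the rule as described in the issue, not the code from the patch.

```java
// Hypothetical helper modelling the scope rule from the issue description; not HBase code.
final class PrivilegeCheckScope {
  private PrivilegeCheckScope() {}

  // Returns true if 'requester' may ask whether 'target' holds some privilege.
  static boolean mayCheck(String requester, String target,
      boolean requesterIsAdminOrSuperuser) {
    if (requester.equals(target)) {
      return true; // self query: no special privilege needed
    }
    // Query about another user: requires ADMIN or superuser privilege.
    return requesterIsAdminOrSuperuser;
  }
}
```

With this rule, userA checking userA passes regardless of privilege, while userA checking userB passes only when userA is an admin or superuser.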



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20357:
---
Fix Version/s: 3.0.0

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>





[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527214#comment-16527214
 ] 

Ted Yu commented on HBASE-20357:


Just committed patch v3.

Please attach addendum.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>





[jira] [Comment Edited] (HBASE-20357) AccessControlClient API Enhancement

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522157#comment-16522157
 ] 

Ted Yu edited comment on HBASE-20357 at 6/29/18 2:50 AM:
-

[~pankaj2461]
Please fill out release note.


was (Author: yuzhih...@gmail.com):
Please fill out release note.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>





[jira] [Updated] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its internalBalancer

2018-06-28 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20791:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.2.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, chen.

Thanks for the review, Reid.

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its 
> internalBalancer
> -
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: 20791-master-v2.patch, HBASE-20791-branch-2-v1.patch, 
> HBASE-20791-master-v1.patch, HBASE-20791-master-v3.patch, 
> HBASE-20791-master-v4.patch
>
>
> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its 
> internalBalancer; otherwise the StochasticLoadBalancer (the internal balancer) 
> will lose its up-to-date RegionLoads info, which affects balancing.
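
The fix described here is a plain delegation: the wrapping balancer forwards each metrics update to the balancer it wraps. The sketch below shows the pattern with hypothetical, heavily simplified types. Balancer, InternalBalancer, GroupBalancer and the Map-based metrics are assumptions for illustration; the real HBase interfaces and ClusterMetrics type are much richer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified types; not the real HBase LoadBalancer interfaces.
interface Balancer {
  void setClusterMetrics(Map<String, Long> metrics);
}

// Stands in for the internal (stochastic) balancer that needs fresh region load data.
final class InternalBalancer implements Balancer {
  private Map<String, Long> metrics = new HashMap<>();

  @Override
  public void setClusterMetrics(Map<String, Long> metrics) {
    this.metrics = new HashMap<>(metrics); // defensive copy of the latest snapshot
  }

  long requestCount(String region) {
    return metrics.getOrDefault(region, 0L);
  }
}

// Stands in for the group-aware wrapper that owns an internal balancer.
final class GroupBalancer implements Balancer {
  private final InternalBalancer internal;

  GroupBalancer(InternalBalancer internal) {
    this.internal = internal;
  }

  @Override
  public void setClusterMetrics(Map<String, Long> metrics) {
    // Forward the update; without this call the internal balancer keeps stale data.
    internal.setClusterMetrics(metrics);
  }
}
```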





[jira] [Commented] (HBASE-20806) Split style journal for flushes and compactions

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526233#comment-16526233
 ] 

Ted Yu commented on HBASE-20806:


Are you targeting 1.x or 2.0 for this work?

Thanks

> Split style journal for flushes and compactions
> ---
>
> Key: HBASE-20806
> URL: https://issues.apache.org/jira/browse/HBASE-20806
> Project: HBase
>  Issue Type: Improvement
>Reporter: Abhishek Singh Chouhan
>Assignee: Abhishek Singh Chouhan
>Priority: Minor
>
> In 1.x we have split transaction journal that gives a clear picture of when 
> various stages of splits took place. We should have a similar thing for 
> flushes and compactions so as to have insights into time spent in various 
> stages, which we can use to identify regressions that might creep in.





[jira] [Commented] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its internalBalancer

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526058#comment-16526058
 ] 

Ted Yu commented on HBASE-20791:


{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.6.1:testCompile (default-testCompile) on project hbase-rsgroup: Compilation failure
[ERROR] /Users/tyu/2-hbase/hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal.java:[88,14] cannot find symbol
[ERROR]   symbol:   method getCpRequestCount()
[ERROR]   location: variable rl of type org.apache.hadoop.hbase.RegionMetrics
{code}
Please attach a patch for branch-2.

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its 
> internalBalancer
> -
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20791-master-v2.patch, HBASE-20791-master-v1.patch, 
> HBASE-20791-master-v3.patch, HBASE-20791-master-v4.patch
>
>





[jira] [Commented] (HBASE-15320) HBase connector for Kafka Connect

2018-06-28 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525956#comment-16525956
 ] 

Ted Yu commented on HBASE-15320:


For hbase-kafka-proxy/src/main/java/org/apache/hadoop/hbase/kafka/HbaseKafkaProxy.java, I think the prefix Hbase in the class name is not needed, since the class is already in the org.apache.hadoop.hbase.kafka package.
{code}
+.filter((arg)->(arg.startsWith("-D")||arg.equalsIgnoreCase("start")))
{code}
Do we accept spellings other than "start" for starting the region server?
{code}
+List addArgs = DEFAULT_PROPERTIES.keySet().stream()
{code}
You can rename the addArgs variable to allArgs.
{code}
+LOG.warn("znode "+idPath+" has unexpected value "
++ " (did the peer name for the proxy change?) "
{code}
Include the current value for the znode in the log.
{code}
+   * @param createIfMissing if the peer doesn't exist, create it and peer to it.
+   */
{code}
Please add javadoc for enablePeer parameter.

Will leave more comments once the patch is uploaded to review board.
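
To make the behavior of the quoted argument filter concrete, the sketch below isolates the same predicate in a small helper. ProxyArgs and passThrough are hypothetical names; only the lambda itself comes from the quoted patch line. Under this predicate any capitalization of "start" passes, along with every -D option, and all other arguments are dropped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper isolating the filter quoted in the review; not the patch itself.
final class ProxyArgs {
  private ProxyArgs() {}

  // Keeps -D system-property options and the "start" command in any letter case.
  static List<String> passThrough(String... args) {
    return Arrays.stream(args)
        .filter(arg -> arg.startsWith("-D") || arg.equalsIgnoreCase("start"))
        .collect(Collectors.toList());
  }
}
```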

> HBase connector for Kafka Connect
> -
>
> Key: HBASE-15320
> URL: https://issues.apache.org/jira/browse/HBASE-15320
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Mike Wingert
>Priority: Major
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-15320.master.1.patch, HBASE-15320.master.10.patch, 
> HBASE-15320.master.11.patch, HBASE-15320.master.12.patch, 
> HBASE-15320.master.2.patch, HBASE-15320.master.3.patch, 
> HBASE-15320.master.4.patch, HBASE-15320.master.5.patch, 
> HBASE-15320.master.6.patch, HBASE-15320.master.7.patch, 
> HBASE-15320.master.8.patch, HBASE-15320.master.8.patch, 
> HBASE-15320.master.9.patch, HBASE-15320.pdf, HBASE-15320.pdf
>
>
> Implement an HBase connector with source and sink tasks for the Connect 
> framework (http://docs.confluent.io/2.0.0/connect/index.html) available in 
> Kafka 0.9 and later.
> See also: 
> http://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines
> An HBase source 
> (http://docs.confluent.io/2.0.0/connect/devguide.html#task-example-source-task)
>  could be implemented as a replication endpoint or WALObserver, publishing 
> cluster wide change streams from the WAL to one or more topics, with 
> configurable mapping and partitioning of table changes to topics.  
> An HBase sink task 
> (http://docs.confluent.io/2.0.0/connect/devguide.html#sink-tasks) would 
> persist, with optional transformation (JSON? Avro?, map fields to native 
> schema?), Kafka SinkRecords into HBase tables.





[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions

2018-06-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525904#comment-16525904
 ] 

Ted Yu commented on HBASE-6028:
---

Mohit:
You don't need to remove previous patches.
They give reviewers an idea of how the implementation evolved over time.

FYI

> Implement a cancel for in-progress compactions
> --
>
> Key: HBASE-6028
> URL: https://issues.apache.org/jira/browse/HBASE-6028
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Derek Wollenstein
>Assignee: Mohit Goel
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-6028.master.007.patch
>
>
> Depending on current server load, it can be extremely expensive to run 
> periodic minor / major compactions.  It would be helpful to have a feature 
> where a user could use the shell or a client tool to explicitly cancel an 
> in-progress compaction.  This would allow a system to recover when too many 
> regions become eligible for compaction at once.





[jira] [Updated] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its internalBalancer

2018-06-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20791:
---
Summary: RSGroupBasedLoadBalancer#setClusterMetrics should pass 
ClusterMetrics to its internalBalancer  (was: 
RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s 
internalBalancer)

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its 
> internalBalancer
> -
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20791-master-v2.patch, HBASE-20791-master-v1.patch, 
> HBASE-20791-master-v3.patch
>
>





[jira] [Commented] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s internalBalancer

2018-06-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525771#comment-16525771
 ] 

Ted Yu commented on HBASE-20791:


I did a search under hbase-server/src/test/ and found the following classes 
whose names end in Base:
{code}
hbase-server/src/test//java/org/apache/hadoop/hbase/AcidGuaranteesTestBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/client/TestAsyncAdminBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/master/balancer/BalancerTestBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/master/procedure/TestTableDDLProcedureBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/replication/SerialReplicationTestBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/replication/TestReplicationBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/security/visibility/VisibilityLabelsWithDeletesTestBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/TimestampTestBase.java
hbase-server/src/test//java/org/apache/hadoop/hbase/util/MultiThreadedWriterBase.java
{code}
I think the previous naming with TestBase is okay.

There is no need to change the naming again (from patch v3).

Please address the checkstyle warning and it should be good to go.

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s 
> internalBalancer
> --
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20791-master-v2.patch, HBASE-20791-master-v1.patch, 
> HBASE-20791-master-v3.patch
>
>





[jira] [Updated] (HBASE-20798) Duplicate thread names of StoreFileOpenerThread and StoreFileCloserThread

2018-06-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20798:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the patch, Zephyr.

Thanks for the review, Allan.

> Duplicate thread names of StoreFileOpenerThread and StoreFileCloserThread
> -
>
> Key: HBASE-20798
> URL: https://issues.apache.org/jira/browse/HBASE-20798
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 3.0.0, 2.0.0
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20798-v1.patch, HBASE-20798-v2.patch
>
>
> {code:title=jstack}
>  "StoreFileOpenerThread-info-1" #8994 daemon prio=5 os_prio=0 tid=0x7fad7a46b000 nid=0x624a waiting on condition [0x7fad7f7ef000]
>  "StoreFileOpenerThread-info-1" #8993 daemon prio=5 os_prio=0 tid=0x7fad71064000 nid=0x6249 waiting on condition [0x7fad7fff7000]
> {code}
> Duplicate thread names sometimes appear in jstack output, because 
> StoreFileOpenerThreads are created per region and share the same name. 
> Suggest adding the region name to the corresponding thread name in order to 
> distinguish them. This could be helpful for troubleshooting.





[jira] [Updated] (HBASE-20798) Duplicate thread names of StoreFileOpenerThread and StoreFileCloserThread

2018-06-27 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20798:
---
Summary: Duplicate thread names of StoreFileOpenerThread and 
StoreFileCloserThread  (was: Duplicated threads name of 
StoreFileOpenerThread)

> Duplicate thread names of StoreFileOpenerThread and StoreFileCloserThread
> -
>
> Key: HBASE-20798
> URL: https://issues.apache.org/jira/browse/HBASE-20798
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 3.0.0, 2.0.0
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20798-v1.patch, HBASE-20798-v2.patch
>
>





[jira] [Commented] (HBASE-20798) Duplicated threads name of StoreFileOpenerThread

2018-06-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524969#comment-16524969
 ] 

Ted Yu commented on HBASE-20798:


{code}
+ this.region.getRegionInfo().getEncodedName() + this.getColumnFamilyName());
{code}
Please add a space between encoded name and family name.
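
A separator matters because plain concatenation makes different (region, family) pairs indistinguishable in the thread name. The sketch below shows the requested spacing with a hypothetical helper. OpenerThreadNames, its method and the "StoreFileOpenerThread-" prefix layout are assumptions for illustration, not the exact code in the patch.

```java
// Hypothetical helper; the prefix layout and inputs are illustrative assumptions.
final class OpenerThreadNames {
  private OpenerThreadNames() {}

  // Builds a per-region, per-family thread name with a space between the two parts.
  static String threadName(String encodedRegionName, String familyName) {
    return "StoreFileOpenerThread-" + encodedRegionName + " " + familyName;
  }
}
```

Without the space, ("ab", "c") and ("a", "bc") would both produce the suffix "abc"; with it, the two names stay distinct and readable in a thread dump.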

> Duplicated threads name of StoreFileOpenerThread
> --
>
> Key: HBASE-20798
> URL: https://issues.apache.org/jira/browse/HBASE-20798
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 3.0.0, 2.0.0
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20798-v1.patch
>
>





[jira] [Commented] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s internalBalancer

2018-06-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524592#comment-16524592
 ] 

Ted Yu commented on HBASE-20791:


[~reidchan]:
Do you have other comments w.r.t. patch v2?

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s 
> internalBalancer
> --
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20791-master-v2.patch, HBASE-20791-master-v1.patch
>
>





[jira] [Updated] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s internalBalancer

2018-06-26 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20791:
---
Attachment: 20791-master-v2.patch

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s 
> internalBalancer
> --
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20791-master-v2.patch, HBASE-20791-master-v1.patch
>
>





[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions

2018-06-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524446#comment-16524446
 ] 

Ted Yu commented on HBASE-6028:
---

{code}
2018-06-26 13:37:22,042 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16022-shortCompactions-1530045419380] regionserver.CompactSplit: Compaction failed
{code}
When a compaction is interrupted, it is not really an error.
Within CompactSplit, the CompactionRunner knows whether compaction is enabled. In the above case, can we replace the ERROR log with an INFO log saying the compaction was interrupted?
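
One way to read this suggestion is as a small decision on the log level, taken where the failure is caught. The sketch below is an illustration only. CompactionFailureLogging and levelFor are hypothetical names, treating InterruptedIOException or InterruptedException as the interruption signal is an assumption, and java.util.logging levels are used only to keep the example self-contained (HBase itself logs through a different API).

```java
import java.io.InterruptedIOException;
import java.util.logging.Level;

// Hypothetical helper; the exception types treated as "interrupted" are an assumption.
final class CompactionFailureLogging {
  private CompactionFailureLogging() {}

  // Picks the level for a failed compaction: INFO when it was interrupted because
  // compactions were switched off, SEVERE (error) for every other failure.
  static Level levelFor(Exception failure, boolean compactionsEnabled) {
    boolean interrupted = failure instanceof InterruptedIOException
        || failure instanceof InterruptedException;
    if (interrupted && !compactionsEnabled) {
      return Level.INFO;
    }
    return Level.SEVERE;
  }
}
```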

> Implement a cancel for in-progress compactions
> --
>
> Key: HBASE-6028
> URL: https://issues.apache.org/jira/browse/HBASE-6028
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Derek Wollenstein
>Assignee: Mohit Goel
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-6028.master.006.patch
>
>





[jira] [Updated] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-26 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20789:
---
Attachment: bucket-33718.out

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Attachments: bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.





[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524277#comment-16524277
 ] 

Ted Yu commented on HBASE-20789:


https://builds.apache.org/job/HBASE-Flaky-Tests/33718/testReport/junit/org.apache.hadoop.hbase.io.hfile.bucket/TestBucketCache/testCacheBlockNextBlockMetadataMissing_0__blockSize_8_192__bucketSizes_null_/

I will attach the above test output.

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.





[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524186#comment-16524186
 ] 

Ted Yu commented on HBASE-20789:


Zack:
You can get the test failures from:
https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.





[jira] [Commented] (HBASE-20791) RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s internalBalancer

2018-06-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523950#comment-16523950
 ] 

Ted Yu commented on HBASE-20791:


Can you put the patch on Review Board after fixing the failing tests?

> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to it’s 
> internalBalancer
> --
>
> Key: HBASE-20791
> URL: https://issues.apache.org/jira/browse/HBASE-20791
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup
>Affects Versions: 3.0.0, 2.0.0
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: HBASE-20791-master-v1.patch
>
>
> RSGroupBasedLoadBalancer#setClusterMetrics should pass ClusterMetrics to its 
> internalBalancer. Otherwise the StochasticLoadBalancer (the internal balancer) 
> will lose its up-to-date RegionLoads info, which affects balancing.





[jira] [Commented] (HBASE-20769) getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl

2018-06-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522195#comment-16522195
 ] 

Ted Yu commented on HBASE-20769:


[~openinx]:
Can you take a look as well ?

If this is good by you, can you help commit ?

Thanks

> getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.master.001.patch, 
> HBASE-20769.master.002.patch, HBASE-20769.master.003.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), scan.getStopRow(), sp[i],
>       sp[i + 1])) {
>     List<String> hosts =
>         calculateLocationsForInputSplit(conf, htd, hri, tableDir, localityEnabled);
>     Scan boundedScan = new Scan(scan);
>     boundedScan.setStartRow(sp[i]);
>     boundedScan.setStopRow(sp[i + 1]);
>     splits.add(new InputSplit(htd, hri, hosts, boundedScan, restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of each region, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the bounded scan that is created may cover a 
> range outside the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>     Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
> boundedScan.setStopRow(
>     Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i + 1]);
> {code}
> I will also try to add UTs to help catch this problem.
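
The clamping described in the issue can be tried out in isolation. The sketch below is only an illustration under stated assumptions: ScanBounds is a hypothetical class, Arrays.compareUnsigned (Java 9+) stands in for HBase's Bytes.compareTo, and an empty row is treated as "unbounded", which is an assumption added here and is not spelled out in the issue text.

```java
import java.util.Arrays;

// Hypothetical stand-alone helper; Arrays.compareUnsigned stands in for Bytes.compareTo.
final class ScanBounds {
  // Returns the later of the two start rows. An empty user start row sorts first,
  // so the split's own start row is used in that case.
  static byte[] clampStart(byte[] scanStart, byte[] splitStart) {
    return Arrays.compareUnsigned(scanStart, splitStart) > 0 ? scanStart : splitStart;
  }

  // Returns the earlier of the two stop rows. An empty stop row is assumed to mean
  // "unbounded", so it never wins against a non-empty one.
  static byte[] clampStop(byte[] scanStop, byte[] splitStop) {
    if (scanStop.length == 0) {
      return splitStop;
    }
    if (splitStop.length == 0) {
      return scanStop;
    }
    return Arrays.compareUnsigned(scanStop, splitStop) < 0 ? scanStop : splitStop;
  }
}
```

The explicit empty-row checks matter for the stop row: a plain lexicographic comparison would rank an empty stop row as the smallest value and pick it, which would widen the scan instead of narrowing it.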





[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522157#comment-16522157
 ] 

Ted Yu commented on HBASE-20357:


Please fill out release note.

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
> Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> A client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column name and column qualifier for a specific user.
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>        - Namespace/Table (already available)
>        - Namespace/Table + UserName
>        - Table + CF
>        - Table + CF + UserName
>        - Table + CF + CQ
>        - Table + CF + CQ + UserName
>     Scope of retrieving permissions:
>        - Same as existing
>  2. To validate whether a user is allowed to perform the specified 
> operations on a particular table. This is useful for checking user privileges 
> instead of getting ACD during a client operation.
>     User validation can be performed based on the following inputs:
>        - Table + CF + CQ + UserName + Actions
>     Scope of validating user privileges:
>     A user can perform a self check without any special privilege, 
> but the ADMIN privilege is required to perform the check for other users.
>     For example, suppose there are two users, "userA" and "userB". Then 
> there are the following scenarios:
>        - When userA wants to check whether userA has the privilege to 
> perform the mentioned actions, userA does not need the ADMIN privilege, as it 
> is a self query.
>        - When userA wants to check whether userB has the privilege to 
> perform the mentioned actions, userA must have the ADMIN or superuser 
> privilege, as it is querying for another user.
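
The self-check rule in the description boils down to a single condition. The sketch below is a hypothetical illustration only: PrivilegeCheckRule and mayCheck are made-up names and do not come from the attached patches.

```java
// Hypothetical sketch of the rule described in the issue; not code from the patch.
final class PrivilegeCheckRule {
  // A self query needs no special privilege. Querying on behalf of another user
  // requires the caller to hold the ADMIN or superuser privilege.
  static boolean mayCheck(String caller, String target, boolean callerIsAdminOrSuperuser) {
    return caller.equals(target) || callerIsAdminOrSuperuser;
  }
}
```

With this rule, userA checking userA is always allowed, while userA checking userB is allowed only when userA is an admin or superuser.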





[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-25 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522152#comment-16522152
 ] 

Ted Yu commented on HBASE-20357:


+1

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch, HBASE-20357.master.003.patch
>
>
> *Background:*
> Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> A client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column name and column qualifier for a specific user.
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>        - Namespace/Table (already available)
>        - Namespace/Table + UserName
>        - Table + CF
>        - Table + CF + UserName
>        - Table + CF + CQ
>        - Table + CF + CQ + UserName
>     Scope of retrieving permissions:
>        - Same as existing
>  2. To validate whether a user is allowed to perform the specified 
> operations on a particular table. This is useful for checking user privileges 
> instead of getting ACD during a client operation.
>     User validation can be performed based on the following inputs:
>        - Table + CF + CQ + UserName + Actions
>     Scope of validating user privileges:
>     A user can perform a self check without any special privilege, 
> but the ADMIN privilege is required to perform the check for other users.
>     For example, suppose there are two users, "userA" and "userB". Then 
> there are the following scenarios:
>        - When userA wants to check whether userA has the privilege to 
> perform the mentioned actions, userA does not need the ADMIN privilege, as it 
> is a self query.
>        - When userA wants to check whether userB has the privilege to 
> perform the mentioned actions, userA must have the ADMIN or superuser 
> privilege, as it is querying for another user.





[jira] [Commented] (HBASE-20769) getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl

2018-06-24 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521806#comment-16521806
 ] 

Ted Yu commented on HBASE-20769:


lgtm

Please wrap the long line.

> getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.master.001.patch, 
> HBASE-20769.master.002.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), scan.getStopRow(), sp[i],
>       sp[i + 1])) {
>     List<String> hosts =
>         calculateLocationsForInputSplit(conf, htd, hri, tableDir, localityEnabled);
>     Scan boundedScan = new Scan(scan);
>     boundedScan.setStartRow(sp[i]);
>     boundedScan.setStopRow(sp[i + 1]);
>     splits.add(new InputSplit(htd, hri, hosts, boundedScan, restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of each region, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the bounded scan that is created may cover a 
> range outside the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>     Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
> boundedScan.setStopRow(
>     Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i + 1]);
> {code}
> I will also try to add UTs to help catch this problem.





[jira] [Commented] (HBASE-20403) Prefetch sometimes doesn't work with encrypted file system

2018-06-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520778#comment-16520778
 ] 

Ted Yu commented on HBASE-20403:


+1

> Prefetch sometimes doesn't work with encrypted file system
> --
>
> Key: HBASE-20403
> URL: https://issues.apache.org/jira/browse/HBASE-20403
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-2
>Reporter: Umesh Agashe
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: hbase-20403.patch, hbase-20403.patch
>
>
> The log from a long-running test has the following stack trace a few times:
> {code}
> 2018-04-09 18:33:21,523 WARN org.apache.hadoop.hbase.io.hfile.HFileReaderImpl: Prefetch path=hdfs://ns1/hbase/data/default/IntegrationTestBigLinkedList_20180409172704/35f1a7ef13b9d327665228abdbcdffae/meta/9089d98b2a6b4847b3fcf6aceb124988, offset=36884200, end=231005989
> java.lang.IllegalArgumentException
>   at java.nio.Buffer.limit(Buffer.java:275)
>   at org.apache.hadoop.hdfs.ByteBufferStrategy.readFromBlock(ReaderStrategy.java:183)
>   at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:705)
>   at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:766)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:831)
>   at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:197)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:762)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readAtOffset(HFileBlock.java:1559)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1771)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1594)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1488)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$1.run(HFileReaderImpl.java:278)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> Size-on-disk calculations seem to get messed up due to encryption. Possible 
> fixes:
> * Check whether the file is encrypted with FileStatus#isEncrypted() and, if 
> so, do not prefetch.
> * Document that hbase.rs.prefetchblocksonopen cannot be true if the file is 
> encrypted.





[jira] [Commented] (HBASE-20357) AccessControlClient API Enhancement

2018-06-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520764#comment-16520764
 ] 

Ted Yu commented on HBASE-20357:


There are 3 pairs of
bq. result.getParams().addExtraParam("filterUser", filterUser);

in if / else blocks which can be lifted outside the if.
For class InputUser :
{code}
+public String toString() {
+  return name;
{code}
Should group name be included ?
{code}
+  public static List getUserGroups(String user) {
+List userGroup = new ArrayList();
{code}
Looks like the empty ArrayList is only needed in case of an IOException.
I think you can move the assignment of the empty ArrayList to the catch block.
{code}
+   * Returns the currently granted permissions for a given table with 
associated permissions based
+   * on the specified column family, column qualifier and user name.
+   */
+  static List getUserPermissions(Configuration conf, byte[] 
entryName, byte[] cf,
{code}
The entryName may also refer to a namespace. Please modify the javadoc to reflect this.
{code}
+  if (filterUser != null) {
+// Validate the filterUser when specified
+if (!validateFilterUser(username, filterUser, filterUserGroups)
+|| !validateCFAndCQ(permFamily, cf, permQualifier, cq)) {
{code}
The validateCFAndCQ call is common to both cases, with and without filterUser. 
It can be lifted outside the if.
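
For reference, the two refactorings suggested above could take roughly the following shape. This is a sketch under assumptions: ReviewSketch, GroupLookup and accept are hypothetical stand-ins, and the real methods in the patch have different signatures.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the patch under review; only the shape matters.
final class ReviewSketch {
  interface GroupLookup {
    List<String> groupsOf(String user) throws IOException;
  }

  // The empty list is created only on the failure path, inside the catch block.
  static List<String> getUserGroups(GroupLookup lookup, String user) {
    List<String> userGroup;
    try {
      userGroup = lookup.groupsOf(user);
    } catch (IOException e) {
      userGroup = new ArrayList<>();
    }
    return userGroup;
  }

  // The column family / qualifier check is common to both branches, so it is
  // evaluated once, before the filterUser test.
  static boolean accept(String filterUser, boolean userMatches, boolean cfAndCqMatch) {
    if (!cfAndCqMatch) {
      return false;
    }
    return filterUser == null || userMatches;
  }
}
```

Hoisting the shared check keeps the filterUser branch focused on the user comparison alone.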

> AccessControlClient API Enhancement
> ---
>
> Key: HBASE-20357
> URL: https://issues.apache.org/jira/browse/HBASE-20357
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Attachments: HBASE-20357.master.001.patch, 
> HBASE-20357.master.002.patch
>
>
> *Background:*
> Currently HBase ACLs can be retrieved based on the namespace or table name 
> only. There is no direct API available to retrieve the permissions based on 
> the namespace, table name, column family and column qualifier for a specific 
> user.
> A client has to write application logic in multiple steps to retrieve ACLs 
> based on table name, column name and column qualifier for a specific user.
> HBase should enhance the AccessControlClient APIs to simplify this.
> *AccessControlClient API should be extended with the following APIs:*
>  1. To retrieve permissions based on the namespace, table name, column family 
> and column qualifier for a specific user.
>     Permissions can be retrieved based on the following inputs:
>        - Namespace/Table (already available)
>        - Namespace/Table + UserName
>        - Table + CF
>        - Table + CF + UserName
>        - Table + CF + CQ
>        - Table + CF + CQ + UserName
>     Scope of retrieving permissions:
>        - Same as existing
>  2. To validate whether a user is allowed to perform the specified 
> operations on a particular table. This is useful for checking user privileges 
> instead of getting ACD during a client operation.
>     User validation can be performed based on the following inputs:
>        - Table + CF + CQ + UserName + Actions
>     Scope of validating user privileges:
>     A user can perform a self check without any special privilege, 
> but the ADMIN privilege is required to perform the check for other users.
>     For example, suppose there are two users, "userA" and "userB". Then 
> there are the following scenarios:
>        - When userA wants to check whether userA has the privilege to 
> perform the mentioned actions, userA does not need the ADMIN privilege, as it 
> is a self query.
>        - When userA wants to check whether userB has the privilege to 
> perform the mentioned actions, userA must have the ADMIN or superuser 
> privilege, as it is querying for another user.





[jira] [Updated] (HBASE-20740) StochasticLoadBalancer should consider CoprocessorService request factor when computing cost

2018-06-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20740:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Ran TestStochasticLoadBalancer locally with patch v3 which passed.

Thanks for the patch, chen.

> StochasticLoadBalancer should consider CoprocessorService request factor when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: 20740-master-v2.patch, HBASE-20740-master-v1.patch, 
> HBASE-20740-master-v3.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorService requests are ignored. 
> Our KYLIN cluster serves only CoprocessorService requests, and the 
> cluster is sometimes unbalanced.





[jira] [Updated] (HBASE-20740) StochasticLoadBalancer should consider CoprocessorService request factor when computing cost

2018-06-22 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20740:
---
Release Note: 
Introduce CPRequestCostFunction for StochasticLoadBalancer, which will consider 
CoprocessorService request count when computing region load cost.
The multiplier can be set with hbase.master.balancer.stochastic.cpRequestCost; 
the default value is 5.

  was:
Introduce CPRequestCostFunction for StochasticLoadBalancer, which will consider 
CoprocessorService request count when compute region load cost.
the multiplier can be set by hbase.master.balancer.stochastic.cpRequestCost, 
default value is 5.
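
To show what a cost multiplier does, here is a small stand-alone sketch. It assumes the balancer combines its cost functions as a weighted sum and that each cost is scaled to [0, 1]; WeightedCostSketch is a hypothetical class, and every key and value other than the cpRequestCost default of 5 is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the StochasticLoadBalancer code: combines per-function
// costs as a weighted sum, assuming each cost is already scaled to [0, 1].
final class WeightedCostSketch {
  private final Map<String, Double> multipliers = new LinkedHashMap<>();

  void setMultiplier(String costFunction, double multiplier) {
    multipliers.put(costFunction, multiplier);
  }

  // Functions without a reported cost contribute nothing to the total.
  double totalCost(Map<String, Double> costs) {
    double total = 0.0;
    for (Map.Entry<String, Double> e : multipliers.entrySet()) {
      total += e.getValue() * costs.getOrDefault(e.getKey(), 0.0);
    }
    return total;
  }
}
```

Under this assumption, raising the multiplier makes an imbalance in CoprocessorService requests weigh more heavily in the total, and setting it to 0 removes the function from consideration.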


> StochasticLoadBalancer should consider CoprocessorService request factor when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20740-master-v2.patch, HBASE-20740-master-v1.patch, 
> HBASE-20740-master-v3.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorService requests are ignored. 
> Our KYLIN cluster serves only CoprocessorService requests, and the 
> cluster is sometimes unbalanced.





[jira] [Commented] (HBASE-20740) StochasticLoadBalancer should consider CoprocessorService request factor when computing cost

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519759#comment-16519759
 ] 

Ted Yu commented on HBASE-20740:


Please fill out release note.

> StochasticLoadBalancer should consider CoprocessorService request factor when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20740-master-v2.patch, HBASE-20740-master-v1.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorService requests are ignored. 
> Our KYLIN cluster serves only CoprocessorService requests, and the 
> cluster is sometimes unbalanced.





[jira] [Commented] (HBASE-20740) StochasticLoadBalancer should consider CoprocessorService request factor when computing cost

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519749#comment-16519749
 ] 

Ted Yu commented on HBASE-20740:


Test failure doesn't seem to be related.

Can you address checkstyle warning ?

> StochasticLoadBalancer should consider CoprocessorService request factor when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20740-master-v2.patch, HBASE-20740-master-v1.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorService requests are ignored. 
> Our KYLIN cluster serves only CoprocessorService requests, and the 
> cluster is sometimes unbalanced.





[jira] [Commented] (HBASE-20769) getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519720#comment-16519720
 ] 

Ted Yu commented on HBASE-20769:


When I ran the new test without fix:
{code}
testWithMockedMapReduceWithSplitsPerRegion(org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat)
  Time elapsed: 21.222 sec  <<< FAILURE!
java.lang.AssertionError: yya >= yyy?
at 
org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat.verifyWithMockedMapReduce(TestTableSnapshotInputFormat.java:357)
at 
org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat.testWithMockedMapReduceWithSplitsPerRegion(TestTableSnapshotInputFormat.java:269)
{code}
{code}
  Assert.assertTrue(Bytes.toStringBinary(startRow) + " <= "+ 
Bytes.toStringBinary(scan.getStartRow()) + "?", Bytes.compareTo(startRow, 
scan.getStartRow()) <= 0);
  Assert.assertTrue(Bytes.toStringBinary(stopRow) + " >= "+ 
Bytes.toStringBinary(scan.getStopRow()) + "?", Bytes.compareTo(stopRow, 
scan.getStopRow()) >= 0);
{code}
First, using '?' doesn't fit an assertion message: if the test fails, the output 
should be definitive.
Second, please wrap the long lines.
{code}
+public TableSnapshotInputFormatImpl.InputSplit getDelegate() {
{code}
The above can be package private.
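
As an illustration of a definitive message, a check could state what went wrong rather than ask a question. The sketch below is hypothetical: DefinitiveMessages is not part of the test class, and plain String comparison is used as a simplification of the byte-wise row comparison.

```java
// Hypothetical sketch; String.compareTo stands in for the byte-wise row comparison.
final class DefinitiveMessages {
  // States the violation outright instead of ending in a question mark.
  static String stopRowMessage(String requestedStop, String actualStop) {
    return "split stop row " + actualStop
        + " is past the requested stop row " + requestedStop;
  }

  // Fails when the split's stop row lies beyond the stop row the user asked for.
  static void checkStopRow(String requestedStop, String actualStop) {
    if (actualStop.compareTo(requestedStop) > 0) {
      throw new AssertionError(stopRowMessage(requestedStop, actualStop));
    }
  }
}
```

A failure then explains itself in the test report without the reader having to open the test source.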

> getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl
> ---
>
> Key: HBASE-20769
> URL: https://issues.apache.org/jira/browse/HBASE-20769
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 2.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20769.master.001.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller 
> than the user-specified scan's start row, or whose stop row is larger than the 
> user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), scan.getStopRow(), sp[i],
>       sp[i + 1])) {
>     List<String> hosts =
>         calculateLocationsForInputSplit(conf, htd, hri, tableDir, localityEnabled);
>     Scan boundedScan = new Scan(scan);
>     boundedScan.setStartRow(sp[i]);
>     boundedScan.setStopRow(sp[i + 1]);
>     splits.add(new InputSplit(htd, hri, hosts, boundedScan, restoreDir));
>   }
> }
> {code}
> Since we split keys by the range of each region, when sp[i] < scan.getStartRow() 
> or sp[i + 1] > scan.getStopRow(), the bounded scan that is created may cover a 
> range outside the user-defined scan.
> The fix should be simple:
> {code}
> boundedScan.setStartRow(
>     Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
> boundedScan.setStopRow(
>     Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i + 1]);
> {code}
> I will also try to add UTs to help catch this problem.





[jira] [Updated] (HBASE-20740) StochasticLoadBalancer should consider CoprocessorService request factor when computing cost

2018-06-21 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20740:
---
Attachment: 20740-master-v2.patch

> StochasticLoadBalancer should consider CoprocessorService request factor when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: 20740-master-v2.patch, HBASE-20740-master-v1.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorService requests are ignored. 
> Our KYLIN cluster serves only CoprocessorService requests, and the 
> cluster is sometimes unbalanced.





[jira] [Updated] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-21 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20748:
---
Status: Open  (was: Patch Available)

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long)_, with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20532) Use try-with-resources in BackupSystemTable

2018-06-21 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20532:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the patch, Andy.

Modified the subject line of your patch to match the title of JIRA.

> Use try-with-resources in BackupSystemTable
> ---
>
> Key: HBASE-20532
> URL: https://issues.apache.org/jira/browse/HBASE-20532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andy Lin
>Assignee: Andy Lin
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HBASE-20532.v0.patch, HBASE-20532.v1.patch, 
> HBASE-20532.v2.patch
>
>
> Use try-with-resources in BackupSystemTable for describeBackupSet and 
> listBackupSets.
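
For readers unfamiliar with the pattern, here is a minimal sketch of try-with-resources. It is not the BackupSystemTable code: `FakeTable` and `readSets` are hypothetical names standing in for a closeable resource (such as an HBase `Table`) and a method that reads from it.

```java
import java.util.ArrayList;
import java.util.List;

final class TryWithResourcesSketch {
  /** Hypothetical stand-in for a closeable resource such as an HBase Table. */
  static final class FakeTable implements AutoCloseable {
    boolean closed = false;

    List<String> listBackupSets() {
      List<String> sets = new ArrayList<>();
      sets.add("set1");
      sets.add("set2");
      return sets;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Reads from the table; close() runs automatically, even if an exception is thrown. */
  static List<String> readSets(FakeTable table) {
    try (FakeTable t = table) {
      return t.listBackupSets();
    }
  }
}
```

Calling `TryWithResourcesSketch.readSets(new FakeTable())` returns the two set names, and the table's `closed` flag is true afterwards without any explicit `finally` block.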





[jira] [Updated] (HBASE-20532) Use try-with-resources in BackupSystemTable

2018-06-21 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20532:
---
Summary: Use try-with-resources in BackupSystemTable  (was: Use try 
-with-resources in BackupSystemTable)

> Use try-with-resources in BackupSystemTable
> ---
>
> Key: HBASE-20532
> URL: https://issues.apache.org/jira/browse/HBASE-20532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andy Lin
>Assignee: Andy Lin
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HBASE-20532.v0.patch, HBASE-20532.v1.patch, 
> HBASE-20532.v2.patch
>
>
> Use try-with-resources in BackupSystemTable for describeBackupSet and 
> listBackupSets.





[jira] [Comment Edited] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519544#comment-16519544
 ] 

Ted Yu edited comment on HBASE-20734 at 6/21/18 4:07 PM:
-

In WALSplitter#finishSplitLogFile :
{code}
Path rootdir = FSUtils.getWALRootDir(conf);
{code}
Renaming rootdir to walRootDir would probably make the code clearer (since 
rootdir means a different directory w.r.t. {{FSUtils.getRootDir}})


was (Author: yuzhih...@gmail.com):
In WALSplitter#finishSplitLogFile :

Path rootdir = FSUtils.getWALRootDir(conf);

Probably renaming rootdir as walRootDir would make the code clearer (since 
rootdir means different directory w.r.t. {{FSUtils.getWALRootDir}})

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519544#comment-16519544
 ] 

Ted Yu commented on HBASE-20734:


In WALSplitter#finishSplitLogFile :

Path rootdir = FSUtils.getWALRootDir(conf);

Probably renaming rootdir as walRootDir would make the code clearer (since 
rootdir means different directory w.r.t. {{FSUtils.getWALRootDir}})

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Commented] (HBASE-20532) Use try -with-resources in BackupSystemTable

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519274#comment-16519274
 ] 

Ted Yu commented on HBASE-20532:


lgtm

Please fix checkstyle warnings


> Use try -with-resources in BackupSystemTable
> 
>
> Key: HBASE-20532
> URL: https://issues.apache.org/jira/browse/HBASE-20532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andy Lin
>Assignee: Andy Lin
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HBASE-20532.v0.patch, HBASE-20532.v1.patch
>
>
> Use try-with-resources in BackupSystemTable for describeBackupSet and 
> listBackupSets.





[jira] [Commented] (HBASE-20631) B: Merge command enhancements

2018-06-21 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519088#comment-16519088
 ] 

Ted Yu commented on HBASE-20631:


Please fix checkstyle warnings.

> B: Merge command enhancements 
> 
>
> Key: HBASE-20631
> URL: https://issues.apache.org/jira/browse/HBASE-20631
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-20631-v1.patch, HBASE-20631-v2.patch
>
>
> Currently, merge supports only list of backup ids, which users must provide. 
> Date range merges seem more convenient for users. 





[jira] [Commented] (HBASE-15320) HBase connector for Kafka Connect

2018-06-19 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517749#comment-16517749
 ] 

Ted Yu commented on HBASE-15320:


Unit test encountered:
{code}
[ERROR] Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: There was an error in the forked process
[ERROR] java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ResourceCheckerJUnitListener
{code}
Please add license header to conf/kafka-route-rules.xml

> HBase connector for Kafka Connect
> -
>
> Key: HBASE-15320
> URL: https://issues.apache.org/jira/browse/HBASE-15320
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: Andrew Purtell
>Assignee: Mike Wingert
>Priority: Major
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-15320.master.1.patch, HBASE-15320.master.10.patch, 
> HBASE-15320.master.2.patch, HBASE-15320.master.3.patch, 
> HBASE-15320.master.4.patch, HBASE-15320.master.5.patch, 
> HBASE-15320.master.6.patch, HBASE-15320.master.7.patch, 
> HBASE-15320.master.8.patch, HBASE-15320.master.8.patch, 
> HBASE-15320.master.9.patch, HBASE-15320.pdf, HBASE-15320.pdf
>
>
> Implement an HBase connector with source and sink tasks for the Connect 
> framework (http://docs.confluent.io/2.0.0/connect/index.html) available in 
> Kafka 0.9 and later.
> See also: 
> http://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latency-data-pipelines
> An HBase source 
> (http://docs.confluent.io/2.0.0/connect/devguide.html#task-example-source-task)
>  could be implemented as a replication endpoint or WALObserver, publishing 
> cluster wide change streams from the WAL to one or more topics, with 
> configurable mapping and partitioning of table changes to topics.  
> An HBase sink task 
> (http://docs.confluent.io/2.0.0/connect/devguide.html#sink-tasks) would 
> persist, with optional transformation (JSON? Avro?, map fields to native 
> schema?), Kafka SinkRecords into HBase tables.





[jira] [Commented] (HBASE-20743) ASF License warnings for branch-1

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516642#comment-16516642
 ] 

Ted Yu commented on HBASE-20743:


Yes.

Here is an excerpt from target/rat.txt:
{code}
  hbase-error-prone/target/checkstyle-result.xml
  hbase-error-prone/target/maven-archiver/pom.properties
  hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
  hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
  hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}

> ASF License warnings for branch-1
> -
>
> Key: HBASE-20743
> URL: https://issues.apache.org/jira/browse/HBASE-20743
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> From https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt :
> {code}
> Lines that start with ? in the ASF License  report indicate files that do not have an Apache license header:
>  !? hbase-error-prone/target/checkstyle-result.xml
>  !? hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
>  !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
>  !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
> {code}
> Looks like they should be excluded.





[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516625#comment-16516625
 ] 

Ted Yu commented on HBASE-20542:


When I ran the test locally with the patch, I saw the following in the test output:
{code}
2018-06-18 20:43:32,244 ERROR [Time-limited test] regionserver.HRegion(1249): Asked to modify this region's (foobar,,1529379812191.c0c4ada07a3a9905699278a1b1fd63ff.) memStoreSizing  to a negative value which is incorrect. Current memStoreSizing=0, delta=-32
java.lang.Exception
  at org.apache.hadoop.hbase.regionserver.HRegion.checkNegativeMemStoreDataSize(HRegion.java:1249)
  at org.apache.hadoop.hbase.regionserver.HRegion.incMemStoreSize(HRegion.java:1229)
  at org.apache.hadoop.hbase.regionserver.RegionServicesForStores.addMemStoreSize(RegionServicesForStores.java:61)
  at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:153)
  at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:332)
  at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:177)
  at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:110)
  at org.apache.hadoop.hbase.regionserver.CompactingMemStore.inMemoryCompaction(CompactingMemStore.java:459)
  at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:439)
  at org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore.testTimeRange(TestCompactingToCellFlatMapMemStore.java:409)
  at org.apache.hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore.testTimeRangeAfterCompaction(TestCompactingToCellFlatMapMemStore.java:374)
{code}

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch, run.sh, workloada, 
> workloadc, workloadx, workloady
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it writes the new value (instead of checking the size after the 
> write is completed). If so, the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.





[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516172#comment-16516172
 ] 

Ted Yu commented on HBASE-20542:


{code}
+  public void waitForUpdates() {
+if(!updatesLock.isWriteLocked()) {
+  updatesLock.writeLock().lock();
{code}
Where is the write lock released?
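
For context, the usual idiom pairs every `lock()` with an `unlock()` in a `finally` block. The sketch below illustrates that idiom only; it is not how the patch handles `updatesLock` (the release may well happen elsewhere in the patch), and `WriteLockSketch` and `withWriteLock` are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class WriteLockSketch {
  private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();

  /** Runs the action under the write lock and always releases the lock afterwards. */
  void withWriteLock(Runnable action) {
    updatesLock.writeLock().lock();
    try {
      action.run();
    } finally {
      // Released on both the normal and the exceptional path.
      updatesLock.writeLock().unlock();
    }
  }

  /** True while any thread holds the write lock. */
  boolean isWriteLocked() {
    return updatesLock.isWriteLocked();
  }
}
```

With this shape the acquire and release sit in the same method, so a reviewer can see at a glance that the lock cannot leak.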

> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it writes the new value (instead of checking the size after the 
> write is completed). If so, the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.





[jira] [Commented] (HBASE-20542) Better heap utilization for IMC with MSLABs

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516158#comment-16516158
 ] 

Ted Yu commented on HBASE-20542:


Took a quick look.
{code}
+while(!succ) {
+  currentActive = getActive();
+  succ = preUpdate(currentActive, cell, memstoreSizing);
{code}
How many {{preUpdate}} calls could potentially take place when there is 
contention?
{code}
+   * @return true if the cell can be added to the
*/
   @Override
-  protected void checkActiveSize() {
-return;
+  protected boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
{code}
The javadoc for @return is incomplete.
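
To make the contention question concrete, here is a generic compare-and-set retry loop, written as a sketch and not taken from the patch. `RetryLoopSketch` and `addAndCountAttempts` are hypothetical names, and `preUpdate` may behave differently. In a loop of this shape the number of attempts has no fixed upper bound, but an attempt fails only when another thread changed the value in between.

```java
import java.util.concurrent.atomic.AtomicLong;

final class RetryLoopSketch {
  private final AtomicLong size = new AtomicLong();

  /** Adds delta with a compare-and-set retry loop and returns how many attempts were needed. */
  int addAndCountAttempts(long delta) {
    int attempts = 0;
    boolean succ = false;
    while (!succ) {
      attempts++;
      long current = size.get();
      // Fails only if another thread modified size between get() and compareAndSet().
      succ = size.compareAndSet(current, current + delta);
    }
    return attempts;
  }

  long size() {
    return size.get();
  }
}
```

Returning the attempt count is one way to measure, in a test or a metric, how many retries actually occur under load.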


> Better heap utilization for IMC with MSLABs
> ---
>
> Key: HBASE-20542
> URL: https://issues.apache.org/jira/browse/HBASE-20542
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Attachments: HBASE-20542.branch-2.001.patch
>
>
> Following HBASE-20188 we realized in-memory compaction combined with MSLABs 
> may suffer from heap under-utilization due to internal fragmentation. This 
> jira presents a solution to circumvent this problem. The main idea is to have 
> each update operation check if it will cause overflow in the active segment 
> *before* it writes the new value (instead of checking the size after the 
> write is completed). If so, the active segment is atomically 
> swapped with a new empty segment, and is pushed (full-yet-not-overflowed) to 
> the compaction pipeline. Later on the IMC daemon will run its compaction 
> operation (flatten index/merge indices/data compaction) in the background. 
> Some subtle concurrency issues should be handled with care. We next elaborate 
> on them.





[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515979#comment-16515979
 ] 

Ted Yu commented on HBASE-20734:


WALSplitter is annotated @InterfaceAudience.Private

It has access to both FileSystems (WAL and root). By isolating the change 
to WALSplitter, there is a chance we don't need to modify the Region interface.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Updated] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20734:
---
Fix Version/s: 3.0.0

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance for recovered edits when hbase.wal.dir is configured to be on 
> different (fast) media than the hbase rootdir, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Commented] (HBASE-20208) Review of SequenceIdAccounting.java

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515908#comment-16515908
 ] 

Ted Yu commented on HBASE-20208:


{code}
[ERROR] /testptch/hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceIdAccounting.java:[31,39] package org.apache.commons.collections4 does not exist
[INFO] 1 error
{code}

> Review of SequenceIdAccounting.java
> ---
>
> Key: HBASE-20208
> URL: https://issues.apache.org/jira/browse/HBASE-20208
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: 20208.1.patch, HBASE-20208.1.patch
>
>
> # Fix checkstyle warnings
> # Use re-usable libraries where possible
> # Improve Map Access
> What got my attention on this class was:
> {code}
> for (Map.Entry e : sequenceids.entrySet()) {
>   long oldestFlushing = Long.MAX_VALUE;
>   long oldestUnflushed = Long.MAX_VALUE;
>   if (flushing != null && flushing.containsKey(e.getKey())) {
>     oldestFlushing = flushing.get(e.getKey());
>   }
>   if (unflushed != null && unflushed.containsKey(e.getKey())) {
>     oldestUnflushed = unflushed.get(e.getKey());
>   }
>   long min = Math.min(oldestFlushing, oldestUnflushed);
>   if (min <= e.getValue()) {
>     return false;
>   }
> {code}
> Here, the two maps are calling _containsKey_ and then _get_.  It is also 
> calling {{e.getKey()}} repeatedly.
> I propose changing this so that {{e.getKey()}} is only called once and 
> instead of looking up an entry with _containsKey_ and then a _get_, simply 
> use _get_ once and check for a 'null' value to check for existence.  It saves 
> two trips through the Map Collection on each loop.
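
The change proposed in the description can be sketched as follows. This is an illustration and not the attached patch: `SingleLookupSketch`, `oldestOrMax` and `allLower` are hypothetical names, and `String` keys are assumed for readability (the key type used in SequenceIdAccounting is not shown in the excerpt). The sketch also assumes the maps never store null values, which the original loop relies on as well, since it unboxes the result of `get`.

```java
import java.util.Map;

final class SingleLookupSketch {
  /** Returns the mapped value, or Long.MAX_VALUE when the map is null or has no entry. */
  static long oldestOrMax(Map<String, Long> map, String key) {
    if (map == null) {
      return Long.MAX_VALUE;
    }
    Long value = map.get(key); // one lookup instead of containsKey followed by get
    return value != null ? value : Long.MAX_VALUE;
  }

  /** Mirrors the loop from the description with one getKey() call and one get() per map. */
  static boolean allLower(
      Map<String, Long> sequenceids, Map<String, Long> flushing, Map<String, Long> unflushed) {
    for (Map.Entry<String, Long> e : sequenceids.entrySet()) {
      String key = e.getKey();
      long min = Math.min(oldestOrMax(flushing, key), oldestOrMax(unflushed, key));
      if (min <= e.getValue()) {
        return false;
      }
    }
    return true;
  }
}
```

Pulling the lookup into a small helper keeps the loop body short and removes the repeated `e.getKey()` calls.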





[jira] [Updated] (HBASE-20208) Review of SequenceIdAccounting.java

2018-06-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20208:
---
Status: Open  (was: Patch Available)

> Review of SequenceIdAccounting.java
> ---
>
> Key: HBASE-20208
> URL: https://issues.apache.org/jira/browse/HBASE-20208
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: 20208.1.patch, HBASE-20208.1.patch
>
>
> # Fix checkstyle warnings
> # Use re-usable libraries where possible
> # Improve Map Access
> What got my attention on this class was:
> {code}
> for (Map.Entry e : sequenceids.entrySet()) {
>   long oldestFlushing = Long.MAX_VALUE;
>   long oldestUnflushed = Long.MAX_VALUE;
>   if (flushing != null && flushing.containsKey(e.getKey())) {
>     oldestFlushing = flushing.get(e.getKey());
>   }
>   if (unflushed != null && unflushed.containsKey(e.getKey())) {
>     oldestUnflushed = unflushed.get(e.getKey());
>   }
>   long min = Math.min(oldestFlushing, oldestUnflushed);
>   if (min <= e.getValue()) {
>     return false;
>   }
> {code}
> Here, the two maps are calling _containsKey_ and then _get_.  It is also 
> calling {{e.getKey()}} repeatedly.
> I propose changing this so that {{e.getKey()}} is only called once and 
> instead of looking up an entry with _containsKey_ and then a _get_, simply 
> use _get_ once and check for a 'null' value to check for existence.  It saves 
> two trips through the Map Collection on each loop.





[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515898#comment-16515898
 ] 

Ted Yu commented on HBASE-20748:


As Chia-ping commented on the PR, the PR is not integrated with the QA bot.
Look at the tail of 
https://builds.apache.org/job/PreCommit-HBASE-Build/13302/console to see what 
happened.

bulkLoadWithCustomVersions duplicates existing code. Please refactor the 
current bulkLoad method and include a unit test in the next patch.
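
One possible shape for such a refactoring, sketched in Java with simplified types: the existing entry point delegates to a single versioned implementation, so the write loop exists only once. Everything here is hypothetical (`BulkLoadRefactorSketch`, `WrittenCell`, `NO_VERSION`, the `List`-based signatures); the real method works on RDDs and writes HFiles, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkLoadRefactorSketch {
  /** Marker for "no custom version supplied" (hypothetical, not from the patch). */
  static final long NO_VERSION = -1L;

  /** Hypothetical stand-in for a cell written to an HFile. */
  static final class WrittenCell {
    final String value;
    final long timestamp;

    WrittenCell(String value, long timestamp) {
      this.value = value;
      this.timestamp = timestamp;
    }
  }

  /** Existing entry point: no versions supplied, so every cell gets nowTimeStamp. */
  static List<WrittenCell> bulkLoad(List<String> values, long nowTimeStamp) {
    List<Long> versions = new ArrayList<>();
    for (int i = 0; i < values.size(); i++) {
      versions.add(NO_VERSION);
    }
    return bulkLoadWithVersions(values, versions, nowTimeStamp);
  }

  /** Shared implementation: uses the version when positive, otherwise nowTimeStamp. */
  static List<WrittenCell> bulkLoadWithVersions(
      List<String> values, List<Long> versions, long nowTimeStamp) {
    List<WrittenCell> written = new ArrayList<>();
    for (int i = 0; i < values.size(); i++) {
      long version = versions.get(i);
      written.add(new WrittenCell(values.get(i), version > 0 ? version : nowTimeStamp));
    }
    return written;
  }
}
```

A unit test for the enhancement could then check that positive versions are preserved and that non-positive ones fall back to the current timestamp.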

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: spark
>Reporter: Charles PORROT
>Assignee: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long)_, with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}





[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515728#comment-16515728
 ] 

Ted Yu commented on HBASE-20748:


Please also add a unit test utilizing the enhancement.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes this method, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator(KeyFamilyQualifier, Array[Byte])_ as the basis 
> for the writes, this new method would use an _Iterator(KeyFamilyQualifier, 
> Array[Byte], Long)_, with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadWithCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], 
> Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}





[jira] [Commented] (HBASE-20748) HBaseContext bulkLoad: being able to use custom versions

2018-06-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515727#comment-16515727
 ] 

Ted Yu commented on HBASE-20748:


bq. method would throw back to

I think you meant 'fall back'

Your code is very similar to the current {{bulkLoad}} method.
Since the hbase-spark module isn't in any hbase release, you can customize the 
existing method.

Please upload the next patch as a diff.

> HBaseContext bulkLoad: being able to use custom versions
> 
>
> Key: HBASE-20748
> URL: https://issues.apache.org/jira/browse/HBASE-20748
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase, spark
>Reporter: Charles PORROT
>Priority: Major
>  Labels: HBaseContext, bulkload, spark, versions
> Attachments: bulkLoadCustomVersions.scala
>
>
> The _bulkLoad_ methods of _class org.apache.hadoop.hbase.spark.HBaseContext_ 
> use the system's current time for the version of the cells to bulk-load. This 
> makes _bulkLoad_, and its twin _bulkLoadThinRows_, useless if you need to 
> use your own versioning system:
> {code:java}
> //Here is where we finally iterate through the data in this partition of the 
> //RDD that has been sorted and partitioned
> val wl = writeValueToHFile(
>   keyFamilyQualifier.rowKey, 
>   keyFamilyQualifier.family, 
>   keyFamilyQualifier.qualifier, 
>   cellValue, 
>   nowTimeStamp, 
>   fs, 
>   conn, 
>   localTableName, 
>   conf, 
>   familyHFileWriteOptionsMapInternal, 
>   hfileCompression, 
>   writerMap, 
>   stagingDir
> ){code}
>  
> Thus, I propose a third _bulkLoad_ method, based on the original method. 
> Instead of using an _Iterator[(KeyFamilyQualifier, Array[Byte])]_ as the basis 
> for the writes, this new method would use an _Iterator[(KeyFamilyQualifier, 
> Array[Byte], Long)]_, with the _Long_ being the version.
>  
> Definition of _bulkLoad_:
> {code:java}
> def bulkLoad[T](
> rdd:RDD[T], 
> tableName: TableName, 
> flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte])], 
> stagingDir:String, 
> familyHFileWriteOptionsMap: util.Map[Array[Byte], FamilyHFileWriteOptions] = 
> new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
> compactionExclude: Boolean = false, 
> maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> Definition of a _bulkLoadCustomVersions_ method:
> {code:java}
> def bulkLoadCustomVersions[T](rdd:RDD[T],
>   tableName: TableName,
>   flatMap: (T) => Iterator[(KeyFamilyQualifier, Array[Byte], Long)],
>   stagingDir:String,
>   familyHFileWriteOptionsMap:
>   util.Map[Array[Byte], FamilyHFileWriteOptions] =
>   new util.HashMap[Array[Byte], FamilyHFileWriteOptions],
>   compactionExclude: Boolean = false,
>   maxSize:Long = HConstants.DEFAULT_MAX_FILE_SIZE):{code}
> In case of illogical version (for instance, a negative version), the method 
> would throw back to the current timestamp.
> {code:java}
> val wl = writeValueToHFile(keyFamilyQualifier.rowKey,
>   keyFamilyQualifier.family,
>   keyFamilyQualifier.qualifier,
>   cellValue,
>   if (version > 0) version else nowTimeStamp,
>   fs,
>   conn,
>   localTableName,
>   conf,
>   familyHFileWriteOptionsMapInternal,
>   hfileCompression,
>   writerMap,
>   stagingDir){code}
> See the attached file for the full proposed method.
>  
> +Edit:+
> The same could be done with bulkLoadThinRows: instead of a:
> {code:java}
> Iterator[Pair[ByteArrayWrapper, FamiliesQualifiersValues]]{code}
> We expect an:
> {code:java}
>  Iterator[Triple[ByteArrayWrapper, FamiliesQualifiersValues, Long]]{code}
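
The version rule quoted in the description can be looked at in isolation. The following is a minimal Java sketch, not the hbase-spark code: the class and method names (VersionFallback, resolveVersion) are hypothetical, and only the condition {{version > 0}} is taken from the quoted snippet. One consequence worth noting is that, with this condition, a version of 0 is also replaced by the current timestamp, not only negative versions.

```java
/** Hypothetical helper mirroring `if (version > 0) version else nowTimeStamp`. */
final class VersionFallback {
    private VersionFallback() {}

    /**
     * Returns the caller-supplied version when it is strictly positive,
     * otherwise the fallback timestamp.
     */
    static long resolveVersion(long version, long nowTimeStamp) {
        return version > 0 ? version : nowTimeStamp;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // A positive custom version is kept as-is.
        System.out.println(resolveVersion(42L, now) == 42L);
        // Negative and zero versions are replaced by the current timestamp.
        System.out.println(resolveVersion(-1L, now) == now);
        System.out.println(resolveVersion(0L, now) == now);
    }
}
```

If 0 should be accepted as a valid version, the condition would need to be {{version >= 0}}; that choice is left to the patch author.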





[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.6
   2.1.0
   Status: Resolved  (was: Patch Available)

Thanks for the reviews.

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 2.1.0, 1.4.6
>
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>
> Description:
> When a custom hbase.wal.dir is configured, the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes a loss of writes 
> that had not been flushed prior to the loss of a server.
>  
> Reproduction:
> This is an Azure HDInsight HBase cluster with HDP 2.6 and HBase 
> 1.1.2.2.6.3.2-14.
> By default the underlying data goes to wasb://x@y/hbase.
> I tried to move the WAL folders to HDFS, which is on the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir=hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir=wasb://XYZ@hbaseperf.core.net/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored the actual WAL because WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than in the 
> hbase.rootdir.
> Looking at the source code,
>  
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, and waldir and hbase rootdir are in different paths or 
> even on different filesystems, then #5, which uses walDir as tableDir, is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.
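
The mismatch described above can be illustrated with plain string paths. This is a minimal Java sketch, not WALSplitter code: the class RegionDirSketch and its helper regionDir are hypothetical, and the layout base/data/namespace/table/region is an assumption read off the log line in the description. The wasb path for the root dir is built by analogy and is also an assumption.

```java
/** Hypothetical illustration of resolving a region directory under two different base dirs. */
final class RegionDirSketch {
    private RegionDirSketch() {}

    /** Builds base/data/namespace/table/region; layout assumed from the quoted log line. */
    static String regionDir(String baseDir, String namespace, String table, String encodedRegion) {
        return baseDir + "/data/" + namespace + "/" + table + "/" + encodedRegion;
    }

    public static void main(String[] args) {
        String walDir = "hdfs://mycluster/walontest"; // hbase.wal.dir from the report
        String rootDir = "wasb://x@y/hbase";          // hbase.rootdir placeholder from the report
        String region = "b7fd7db5694eb71190955292b3ff7648";

        // What the splitter checked according to the log: a path under the WAL dir.
        String checked = regionDir(walDir, "default", "tb1", region);
        // Where the region data would live if regions are stored under the root dir.
        String expected = regionDir(rootDir, "default", "tb1", region);

        System.out.println("checked:  " + checked);
        System.out.println("expected: " + expected);
        System.out.println("same location: " + checked.equals(expected));
    }
}
```

When the two settings point at the same directory the two paths coincide, which would explain why the problem only shows up once hbase.wal.dir is customized.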





[jira] [Created] (HBASE-20744) Address FindBugs warnings in branch-1

2018-06-16 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20744:
--

 Summary: Address FindBugs warnings in branch-1
 Key: HBASE-20744
 URL: https://issues.apache.org/jira/browse/HBASE-20744
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350//JDK8_Nightly_Build_Report_(Hadoop2)/ :
{code}
FindBugs module:hbase-common
Inconsistent synchronization of 
org.apache.hadoop.hbase.io.encoding.EncodedDataBlock$BufferGrabbingByteArrayOutputStream.ourBytes;
 locked 50% of time. Unsynchronized access at EncodedDataBlock.java:[line 258]
{code}
{code}
FindBugs module:hbase-hadoop2-compat
java.util.concurrent.ScheduledThreadPoolExecutor stored into non-transient 
field MetricsExecutorImpl$ExecutorSingleton.scheduler At 
MetricsExecutorImpl.java:[line 51]
{code}
{code}
FindBugs module:hbase-server
instanceof will always return false in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since a org.apache.hadoop.hbase.quotas.RpcThrottlingException can't 
be a org.apache.hadoop.hbase.quotas.ThrottlingException At 
RegionServerQuotaManager.java:[line 193]
instanceof will always return true for all non-null values in 
org.apache.hadoop.hbase.quotas.RegionServerQuotaManager.checkQuota(Region, int, 
int, int), since all org.apache.hadoop.hbase.quotas.RpcThrottlingException are 
instances of org.apache.hadoop.hbase.quotas.RpcThrottlingException At 
RegionServerQuotaManager.java:[line 199]
{code}
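
For readers unfamiliar with the two instanceof warnings, here is a minimal Java sketch of the general pattern they describe. It is not the RegionServerQuotaManager code: the classes InstanceofSketch, LegacyLimitException and RpcLimitException are hypothetical stand-ins for two exception types where neither extends the other.

```java
/** Hypothetical illustration of instanceof checks whose outcome is fixed by the value's type. */
final class InstanceofSketch {
    private InstanceofSketch() {}

    /** Stand-in for the older exception type; unrelated to RpcLimitException. */
    static class LegacyLimitException extends Exception {
        LegacyLimitException(String message) { super(message); }
    }

    /** Stand-in for the newer exception type. */
    static class RpcLimitException extends Exception {
        RpcLimitException(String message) { super(message); }
    }

    static String describe() {
        // The static type is Exception, but the value is always an RpcLimitException.
        Exception e = new RpcLimitException("limit exceeded");
        if (e instanceof LegacyLimitException) {
            return "legacy"; // never reached: the two types are unrelated
        }
        if (e instanceof RpcLimitException) {
            return "rpc";    // always reached for a non-null value
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

In code like this the first branch is dead and the second check is redundant, so a fix typically removes one and simplifies the other; which of the two is right for branch-1 depends on the intended exception type.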





[jira] [Created] (HBASE-20743) ASF License warnings for branch-1

2018-06-16 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20743:
--

 Summary: ASF License warnings for branch-1
 Key: HBASE-20743
 URL: https://issues.apache.org/jira/browse/HBASE-20743
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


From https://builds.apache.org/job/HBase%20Nightly/job/branch-1/350/artifact/output-general/patch-asflicense-problems.txt :
{code}
Lines that start with ? in the ASF License  report indicate files that do 
not have an Apache license header:
 !? hbase-error-prone/target/checkstyle-result.xml
 !? hbase-error-prone/target/classes/META-INF/services/com.google.errorprone.bugpatterns.BugChecker
 !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
 !? hbase-error-prone/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
{code}
Looks like they should be excluded.





[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514642#comment-16514642
 ] 

Ted Yu commented on HBASE-20723:


Integrated to master and branch-1; waiting for the Jenkins result before 
integrating to other branches.

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514641#comment-16514641
 ] 

Ted Yu commented on HBASE-20723:


I ran TestLoadIncrementalHFiles with the patch in branch-1, and it passed.

The test failure was not related: that test is for bulk load.

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.branch-2.txt

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514534#comment-16514534
 ] 

Ted Yu commented on HBASE-20723:


The patch for branch-1 adds a correction to TestWALFactory in the following form:
{code}
-HRegion region = HRegion.openHRegion(this.conf, this.fs, hbaseWALRootDir, 
hri, htd, wal);
+HRegion region = HRegion.openHRegion(this.conf, this.fs, hbaseRootDir, 
hri, htd, wal);
{code}
i.e. the region should be created under rootdir, not under wal.dir

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.branch-1.txt, 20723.v1.txt, 20723.v10.txt, 
> 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 20723.v5.txt, 
> 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.branch-1.txt

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.branch-1.txt, 20723.v1.txt, 20723.v10.txt, 
> 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 20723.v5.txt, 
> 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Summary: Custom hbase.wal.dir results in data loss because we write 
recovered edits into a different place than where the recovering region server 
looks for them  (was: Custom hbase.wal.dir results in dataloss because we write 
recovered edits into a different place than where the recovering region server 
looks for them.)

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 
> 20723.v4.txt, 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 
> 20723.v8.txt, 20723.v9.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in dataloss because we write recovered edits into a different place than where the recovering region server looks for them.

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v10.txt

> Custom hbase.wal.dir results in dataloss because we write recovered edits 
> into a different place than where the recovering region server looks for them.
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 
> 20723.v4.txt, 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 
> 20723.v8.txt, 20723.v9.txt, logs.zip
>
>
> Description:
> When custom hbase.wal.dir is configured the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes a loss of writes 
> that had not flushed prior to loss of a server.
>  
> Reproduction:
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored the actual WAL because WALSplitter looks for the 
> region directory in the hbase.wal.dir we specified rather than in 
> hbase.rootdir.
> Looking at the source code,
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
> it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437 and the WAL dir and the HBase rootdir are on different 
> paths or even on different filesystems, then #5, which uses walDir as tableDir, 
> is apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.
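
The path mismatch described above can be illustrated with a small, self-contained sketch. This is not WALSplitter code: the class and method names are hypothetical, and the only inputs are the hbase.rootdir and hbase.wal.dir values and the region name taken from the report. It resolves the same region directory against each base to show why the probe under hbase.wal.dir finds nothing.

```java
// Sketch only: names below are hypothetical and are not the HBase API.
final class RegionDirSketch {
  /** Builds <base>/data/<namespace>/<table>/<encodedRegionName>. */
  static String regionDir(String base, String namespace, String table, String encodedRegion) {
    String trimmed = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
    return trimmed + "/data/" + namespace + "/" + table + "/" + encodedRegion;
  }

  public static void main(String[] args) {
    String rootDir = "wasb://x@y/hbase";           // hbase.rootdir from the report
    String walDir = "hdfs://mycluster/walontest";  // hbase.wal.dir from the report
    String region = "b7fd7db5694eb71190955292b3ff7648";

    // Path the splitter probed according to the log (resolved against walDir).
    System.out.println(regionDir(walDir, "default", "tb1", region));
    // Path where the region directory lives (resolved against rootDir).
    System.out.println(regionDir(rootDir, "default", "tb1", region));
  }
}
```

The first printed path matches the one in the region server log; the second is where the region directory would be expected when table data stays under hbase.rootdir.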



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in dataloss because we write recovered edits into a different place than where the recovering region server looks for them.

2018-06-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514438#comment-16514438
 ] 

Ted Yu commented on HBASE-20723:


Rohan and I did more testing in an Azure cluster (with the config shown in the 
description).

Killing a region server no longer incurs data loss with patch v10.

> Custom hbase.wal.dir results in dataloss because we write recovered edits 
> into a different place than where the recovering region server looks for them.
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 
> 20723.v4.txt, 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 
> 20723.v8.txt, 20723.v9.txt, logs.zip
>
>


[jira] [Updated] (HBASE-20740) StochasticLoadBalancer should consider Coprocessor request count when computing cost

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20740:
---
Summary: StochasticLoadBalancer should consider Coprocessor request count 
when computing cost  (was: StochasticLoadBalancer should consider Coprocessor 
request count when compute cost)

> StochasticLoadBalancer should consider Coprocessor request count when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: HBASE-20740-master-v1.patch
>
>
> When computing region load cost, ReadRequest, WriteRequest, MemStoreSize and 
> StoreFileSize are considered, but CoprocessorRequests are not included. In our 
> Kylin cluster there are only coprocessor requests, and the cluster is 
> sometimes unbalanced.
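
A minimal sketch of the effect described above, under loud assumptions: the imbalance measure here is a simplified stand-in and is not how StochasticLoadBalancer computes its cost functions, and all names are hypothetical. It only shows that when every request is a coprocessor request, a cost built from read and write counts alone sees a perfectly even cluster.

```java
// Sketch only: a simplified imbalance measure, not StochasticLoadBalancer's cost functions.
final class CpRequestCostSketch {
  /** (max - min) / max over per-server totals; 0.0 means perfectly even. */
  static double imbalance(long[] perServerTotals) {
    long max = 0;
    long min = Long.MAX_VALUE;
    for (long total : perServerTotals) {
      max = Math.max(max, total);
      min = Math.min(min, total);
    }
    return max == 0 ? 0.0 : (double) (max - min) / max;
  }

  /** Sums the request counters per server, optionally including coprocessor requests. */
  static double requestCost(long[] reads, long[] writes, long[] cpRequests, boolean includeCp) {
    long[] totals = new long[reads.length];
    for (int i = 0; i < reads.length; i++) {
      totals[i] = reads[i] + writes[i] + (includeCp ? cpRequests[i] : 0);
    }
    return imbalance(totals);
  }

  public static void main(String[] args) {
    long[] reads = {0, 0};
    long[] writes = {0, 0};
    long[] cp = {900, 100}; // all traffic arrives as coprocessor requests
    System.out.println(requestCost(reads, writes, cp, false));
    System.out.println(requestCost(reads, writes, cp, true));
  }
}
```

With the sample counters, the cost without coprocessor requests is 0.0, so nothing would prompt a rebalance, while including them yields a cost of roughly 0.89.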





[jira] [Commented] (HBASE-20740) StochasticLoadBalancer should consider Coprocessor request count when computing cost

2018-06-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513667#comment-16513667
 ] 

Ted Yu commented on HBASE-20740:


bq. only have coprocessor requests

Please describe your use case in more detail: how do coprocessors generate 
requests if there are no requests from clients?
{code}
+   * @return the number of write requests made to region
+   */
+  public long getCpRequestCount();
{code}
The javadoc says write requests, but the method name doesn't reflect that (it 
implies read+write). Please go over the patch and make the method name(s) match 
the semantics you want.

Please upload next patch to review board.
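
For illustration only, one way the javadoc and the method name could be brought into agreement is sketched below. The interface is hypothetical and is not the one in the patch; whether the counter is meant to cover writes only or all coprocessor requests is for the patch author to decide.

```java
// Sketch only: a hypothetical interface, not the one in the patch.
interface RegionRequestMetricsSketch {
  /** @return the number of write requests made to the region */
  long getWriteRequestCount();

  /** @return the number of coprocessor requests (reads and writes) made to the region */
  long getCpRequestCount();
}
```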

> StochasticLoadBalancer should consider Coprocessor request count when 
> computing cost
> 
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: HBASE-20740-master-v1.patch
>
>


[jira] [Updated] (HBASE-20740) StochasticLoadBalancer should consider Coprocessor request count when compute cost

2018-06-15 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20740:
---
Status: Patch Available  (was: Open)

> StochasticLoadBalancer should consider Coprocessor request count when compute 
> cost
> --
>
> Key: HBASE-20740
> URL: https://issues.apache.org/jira/browse/HBASE-20740
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer
>Reporter: chenxu
>Assignee: chenxu
>Priority: Major
> Attachments: HBASE-20740-master-v1.patch
>
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the recovered edits root path

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Summary: WALSplitter uses the rootDir, which is walDir, as the recovered 
edits root path  (was: WALSplitter uses the rootDir, which is walDir, as the 
tableDir root path.)

> WALSplitter uses the rootDir, which is walDir, as the recovered edits root 
> path
> ---
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
   Priority: Critical  (was: Major)
Component/s: wal  (was: hbase)

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513205#comment-16513205
 ] 

Ted Yu commented on HBASE-20734:


After HBASE-17437 was incorporated into the 1.4 release, the recovered edits dir 
can diverge from where wal.dir is configured.

HBASE-20723 fixes data loss.

This issue is to get recovered edits dir colocated with wal.dir, considering 
combinations users may already have in their deployment.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than the HBase rootdir w.r.t. recovered edits, since the recovered edits 
> directory is currently under rootdir.
> Such a setup may not result in fast recovery when there is a region server 
> failover.
> This issue is to find a proper (hopefully backward compatible) way of 
> colocating the recovered edits directory with hbase.wal.dir.





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v9.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: (was: 20723.v9.txt)

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513150#comment-16513150
 ] 

Ted Yu commented on HBASE-20734:


If the new region server only knows about the recovered edits dir under 
wal.dir, it may miss edits located in the recovered edits dir under the root dir.
So the new code needs to look for edits in both places.
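
A rough sketch of what "look in both places" could mean, under stated assumptions: local java.nio paths stand in for Hadoop FileSystem paths, the class and method names are hypothetical, and the "recovered.edits" subdirectory name and the name-based ordering are assumptions made for illustration rather than a description of the eventual patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch only: local java.nio paths stand in for Hadoop FileSystem paths.
final class RecoveredEditsLookupSketch {
  // Assumed name of the per-region recovered edits subdirectory.
  static final String RECOVERED_EDITS_DIR = "recovered.edits";

  /** Lists edit files found under either candidate region dir, ordered by file name. */
  static List<Path> findRecoveredEdits(Path walRegionDir, Path rootRegionDir) {
    // A set avoids scanning twice when both settings point at the same dir.
    Set<Path> regionDirs = new LinkedHashSet<>(Arrays.asList(walRegionDir, rootRegionDir));
    List<Path> found = new ArrayList<>();
    for (Path regionDir : regionDirs) {
      Path editsDir = regionDir.resolve(RECOVERED_EDITS_DIR);
      if (!Files.isDirectory(editsDir)) {
        continue; // nothing was written at this location
      }
      try (DirectoryStream<Path> files = Files.newDirectoryStream(editsDir)) {
        for (Path file : files) {
          found.add(file);
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    // Assumption: file names are zero-padded sequence ids, so name order is replay order.
    found.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
    return found;
  }

  /** Creates one edits file per location in temp dirs and returns how many were found. */
  static int demo() {
    try {
      Path walRegion = Files.createTempDirectory("wal-region");
      Path rootRegion = Files.createTempDirectory("root-region");
      Path walEdits = Files.createDirectories(walRegion.resolve(RECOVERED_EDITS_DIR));
      Path rootEdits = Files.createDirectories(rootRegion.resolve(RECOVERED_EDITS_DIR));
      Files.createFile(walEdits.resolve("0000000000000000002"));
      Files.createFile(rootEdits.resolve("0000000000000000001"));
      return findRecoveredEdits(walRegion, rootRegion).size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Scanning both locations keeps edits written by an older region server (under the root dir) visible to a newer one that writes under wal.dir, which is the rolling upgrade concern.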

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513139#comment-16513139
 ] 

Ted Yu commented on HBASE-20723:


Patch v9 removed the FileSystem parameter from getRegionSplitEditsPath() since it 
can be obtained through the conf.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v9.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513077#comment-16513077
 ] 

Ted Yu commented on HBASE-20734:


Moving the recovered edits dir under wal.dir is the obvious action to take.

But we should consider the effect on rolling upgrades, etc.

The new code needs to be able to recognize both locations.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513075#comment-16513075
 ] 

Ted Yu commented on HBASE-20723:


Patch v8 does variable renaming w.r.t. walDir.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, logs.zip
>
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v8.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.
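
To illustrate the mismatch described above, here is a minimal sketch in plain Java. It is not the actual WALSplitter code: the helper regionDir is hypothetical and only mirrors the <base>/data/<namespace>/<table>/<region> layout visible in the log line, and the rootdir value is the placeholder wasb://x@y/hbase from the report.

```java
public class TableDirSketch {

    // Hypothetical helper: builds <base>/data/<namespace>/<table>/<region>,
    // the layout seen in the region server log line quoted in the report.
    static String regionDir(String baseDir, String namespace, String table, String encodedRegion) {
        return baseDir + "/data/" + namespace + "/" + table + "/" + encodedRegion;
    }

    public static void main(String[] args) {
        String rootDir = "wasb://x@y/hbase";           // hbase.rootdir (placeholder from the report)
        String walDir = "hdfs://mycluster/walontest";  // hbase.wal.dir
        String region = "b7fd7db5694eb71190955292b3ff7648";

        // Deriving the region directory from the WAL dir yields the path from the log,
        // which does not exist because table data lives under rootdir.
        String derivedFromWalDir = regionDir(walDir, "default", "tb1", region);
        // Deriving it from rootdir yields the location where the region actually lives.
        String derivedFromRootDir = regionDir(rootDir, "default", "tb1", region);

        System.out.println("looked up under wal dir: " + derivedFromWalDir);
        System.out.println("expected under rootdir:  " + derivedFromRootDir);
    }
}
```

The two printed paths differ only in their base, which is why the check for the region directory fails whenever hbase.wal.dir and hbase.rootdir point to different locations.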



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513062#comment-16513062
 ] 

Ted Yu commented on HBASE-20723:


bq. some indication of the number of records replayed

The above metric alone may not be enough to indicate whether there was data 
loss. I suggest formulating the combination of factors for alerting the user in 
another issue.

For the last paragraph, I agree.
I already logged a JIRA: HBASE-20734

For this data loss bug, I suggest we fix it in this issue since the proper 
solution to HBASE-20734 may take longer to come up with.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, logs.zip
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513043#comment-16513043
 ] 

Ted Yu commented on HBASE-20723:


Patch v7 sets wal.dir to a different location than rootdir in TestWALFactory.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, logs.zip
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v7.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, logs.zip
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513023#comment-16513023
 ] 

Ted Yu commented on HBASE-20723:


For #2 above, we already have the following in JMX:
{code}
"Replay_num_ops" : 0,
"Replay_min" : 0,
"Replay_max" : 0,
"Replay_mean" : 0.0,
"Replay_median" : 0.0,
"Replay_75th_percentile" : 0.0,
"Replay_95th_percentile" : 0.0,
"Replay_99th_percentile" : 0.0,
{code}
bq. very least be a WARN to indicate something potentially wrong happened

Once this logic error is fixed, I am not aware of any other scenario where the 
message should be WARN.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, logs.zip
>


[jira] [Updated] (HBASE-20676) Give .hbase-snapshot proper ownership upon directory creation

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20676:
---
Labels: snapshot  (was: )

> Give .hbase-snapshot proper ownership upon directory creation
> -
>
> Key: HBASE-20676
> URL: https://issues.apache.org/jira/browse/HBASE-20676
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
>  Labels: snapshot
>
> This is a continuation of the discussion over HBASE-20668.
> The .hbase-snapshot directory is not created at cluster startup. Normally it 
> is created when a snapshot operation is initiated.
> However, if, before any snapshot operation is performed, some non-super user 
> from another cluster conducts ExportSnapshot to this cluster, the 
> .hbase-snapshot directory would be created as that user.
> (This is just one scenario that can lead to wrong ownership.)
> This JIRA is to seek proper way(s) to ensure that the .hbase-snapshot directory 
> always carries proper ownership and permissions upon creation.
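
One possible direction, sketched here purely as an assumption and not as the agreed fix, is to create the directory eagerly (for example at startup) and set its permissions explicitly instead of relying on whichever user happens to create it first. The sketch uses the local filesystem through java.nio as a stand-in for HDFS; the class name, the helper ensureDir and the "rwxr-xr-x" mode are hypothetical, and changing the owner is left out because it needs elevated privileges.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class SnapshotDirSketch {

    // Hypothetical helper: creates the directory if it is missing and then
    // forces the requested permissions, so the result does not depend on
    // the umask or on who created the directory first.
    static Path ensureDir(Path dir, String perms) throws IOException {
        Set<PosixFilePermission> wanted = PosixFilePermissions.fromString(perms);
        if (Files.notExists(dir)) {
            Files.createDirectories(dir);
        }
        // Set explicitly: permissions requested at creation time are filtered by the umask.
        Files.setPosixFilePermissions(dir, wanted);
        return dir;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for hbase.rootdir in this sketch.
        Path root = Files.createTempDirectory("rootdir-sketch");
        Path snapshotDir = ensureDir(root.resolve(".hbase-snapshot"), "rwxr-xr-x");
        System.out.println(snapshotDir + " -> "
            + PosixFilePermissions.toString(Files.getPosixFilePermissions(snapshotDir)));
    }
}
```

The helper is idempotent, so calling it on every startup would be harmless. It requires a POSIX filesystem.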





[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512931#comment-16512931
 ] 

Ted Yu commented on HBASE-20723:


Tried out multiwal:
{code}
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestWALFactory.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestWALFactory.java
index b1fe67b..116b50b 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestWALFactory.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestWALFactory.java
@@ -127,6 +127,7 @@ public class TestWALFactory {
   @BeforeClass
   public static void setUpBeforeClass() throws Exception {
     // Make block sizes small.
+    TEST_UTIL.getConfiguration().set(WALFactory.WAL_PROVIDER, "multiwal");
     TEST_UTIL.getConfiguration().setInt("dfs.blocksize", 1024 * 1024);
     // needed for testAppendClose()
     // quicker heartbeat interval for faster DN death notification
{code}
TestWALFactory#testSplit failed the same way.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, logs.zip
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512925#comment-16512925
 ] 

Ted Yu commented on HBASE-20723:


Josh:
See my earlier comment w.r.t. the incorrect calculation of table dir in 
TestWALFactory#testSplit .
Without the fix, you would see:
{code}
testSplit(org.apache.hadoop.hbase.wal.TestWALFactory)  Time elapsed: 1.55 sec  <<< FAILURE!
java.lang.AssertionError: expected:<9> but was:<0>
    at org.apache.hadoop.hbase.wal.TestWALFactory.verifySplits(TestWALFactory.java:333)
    at org.apache.hadoop.hbase.wal.TestWALFactory.testSplit(TestWALFactory.java:221)
{code}
So TestWALFactory#testSplit should have helped us catch the bug.

w.r.t. multiwal, I don't think that would alter the outcome - one wal per table 
may still incur data loss (as Tak Lon described earlier) since recovered edits 
are located in a place where the current code wouldn't check.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, logs.zip
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v6.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, logs.zip
>


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512875#comment-16512875
 ] 

Ted Yu commented on HBASE-20723:


I ran TestZooKeeper locally with patch v5, and it passed.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, logs.zip
>


[jira] [Created] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Ted Yu (JIRA)
Ted Yu created HBASE-20734:
--

 Summary: Colocate recovered edits directory with hbase.wal.dir
 Key: HBASE-20734
 URL: https://issues.apache.org/jira/browse/HBASE-20734
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


During investigation of HBASE-20723, I realized that we wouldn't get the best 
performance w.r.t. recovered edits when hbase.wal.dir is configured to be on 
different (fast) media than hbase rootdir, since the recovered edits directory 
is currently under rootdir.

Such a setup may not result in fast recovery when there is a region server failover.

This issue is to find a proper (hopefully backward compatible) way of colocating 
the recovered edits directory with hbase.wal.dir.
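
As a rough sketch of what a backward compatible lookup could look like (an assumption for discussion, not a design decision), the reader could check the new location under hbase.wal.dir first and fall back to the existing location under rootdir. The class, the method candidateDirs and the lookup order are hypothetical; only the recovered.edits directory name comes from HBase.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoveredEditsDirSketch {

    static final String RECOVERED_EDITS = "recovered.edits";

    // Hypothetical lookup order: the colocated directory under the WAL dir first,
    // then the existing directory under rootdir for edits written by older code.
    static List<String> candidateDirs(String rootDir, String walDir, String regionPath) {
        List<String> dirs = new ArrayList<>();
        dirs.add(walDir + "/" + regionPath + "/" + RECOVERED_EDITS);
        if (!walDir.equals(rootDir)) {
            dirs.add(rootDir + "/" + regionPath + "/" + RECOVERED_EDITS);
        }
        return dirs;
    }

    public static void main(String[] args) {
        String regionPath = "data/default/tb1/b7fd7db5694eb71190955292b3ff7648";
        for (String dir : candidateDirs("wasb://x@y/hbase", "hdfs://mycluster/walontest", regionPath)) {
            System.out.println(dir);
        }
    }
}
```

When hbase.wal.dir is not set separately and both bases are the same, the list collapses to a single entry, so existing deployments would see no change in behavior under this sketch.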





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v5.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, logs.zip
>


[jira] [Commented] (HBASE-20532) Use try -with-resources in BackupSystemTable

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512240#comment-16512240
 ] 

Ted Yu commented on HBASE-20532:


Can you see if the checkstyle warnings are related?

> Use try -with-resources in BackupSystemTable
> 
>
> Key: HBASE-20532
> URL: https://issues.apache.org/jira/browse/HBASE-20532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andy Lin
>Assignee: Andy Lin
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HBASE-20532.v0.patch
>
>
> Use try -with-resources in BackupSystemTable for describeBackupSet and 
> listBackupSets.
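
For readers unfamiliar with the construct, here is a minimal, generic sketch of try-with-resources. It does not use the real BackupSystemTable code: FakeTable and describe are hypothetical stand-ins for a closeable table handle and a method such as describeBackupSet.

```java
public class TryWithResourcesSketch {

    // Hypothetical stand-in for a closeable table handle.
    static class FakeTable implements AutoCloseable {
        boolean closed = false;

        String read() {
            return "backup-set-1";
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // The resource declared in the try header is closed automatically when the
    // block exits, whether it returns normally or throws.
    static String describe(FakeTable table) {
        try (FakeTable t = table) {
            return t.read();
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        String result = describe(table);
        System.out.println(result + ", closed=" + table.closed);
    }
}
```

Compared with an explicit finally block, this removes the manual close() call and the null check that usually goes with it.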





[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512239#comment-16512239
 ] 

Ted Yu commented on HBASE-20723:


TestWALFactory passes using patch v5 where table directory is located under 
rootdir.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, logs.zip
>


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v5.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored the actual WAL because WALSplitter looks for the 
> region directory under the hbase.wal.dir we specified rather than under 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So when HBASE-17437 is in use and the WAL dir and the HBase rootdir are on 
> different paths, or even on different filesystems, #5 using walDir as the 
> tableDir is apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.
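
To make the failure mode in the report concrete, here is a minimal Java sketch. It is not HBase code: it only assumes the `<base>/data/<namespace>/<table>/<region>` layout visible in the log line above, and `TableDirSketch` and `regionDir` are hypothetical names introduced for illustration. The point is that the same region resolves to two different locations depending on which base directory is passed in.

```java
/** Illustrative only: derives a region directory from a base directory. */
class TableDirSketch {

    // Hypothetical helper modeled on the "<base>/data/<namespace>/<table>/<region>"
    // layout seen in the log line; this is NOT the HBase FSUtils API.
    static String regionDir(String baseDir, String namespace, String table, String encodedRegion) {
        String base = baseDir.endsWith("/") ? baseDir : baseDir + "/";
        return base + "data/" + namespace + "/" + table + "/" + encodedRegion;
    }

    public static void main(String[] args) {
        String rootDir = "wasb://x@y/hbase";            // hbase.rootdir (placeholder used in the report)
        String walDir = "hdfs://mycluster/walontest";   // hbase.wal.dir from the report
        String region = "b7fd7db5694eb71190955292b3ff7648";

        // Resolved under the WAL dir: the location that the log line says does not exist.
        System.out.println(regionDir(walDir, "default", "tb1", region));
        // Resolved under the root dir: where the region data would be expected.
        System.out.println(regionDir(rootDir, "default", "tb1", region));
    }
}
```

If the existence check runs against the first path while the region actually lives under the second, the edits would be treated as already split and discarded, which matches the empty scan in step 4.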





[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512232#comment-16512232
 ] 

Ted Yu commented on HBASE-20723:


Looking at the change from HBASE-17437 in TestWALFactory:
{code}
-Path tabledir = FSUtils.getTableDir(hbaseDir, tableName);
+Path tabledir = FSUtils.getTableDir(hbaseWALDir, tableName);
{code}
[~zyork]:
Can you tell me the intention behind locating the table directory under hbaseWALDir?
I think the table directory should be located under rootdir.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> logs.zip
>
>





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v4.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> logs.zip
>
>





[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512220#comment-16512220
 ] 

Ted Yu commented on HBASE-20723:


It turns out that the root dir in the conf retrieved from the FileSystem may 
deviate from the real root dir, e.g.

hdfs://localhost:52551/user/tyu/test-data/47b4f42b-2ffe-497c-a63f-297b67c2bcb3

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v3.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, logs.zip
>
>





[jira] [Commented] (HBASE-20727) Persist FlushedSequenceId to speed up WAL split after cluster restart

2018-06-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511948#comment-16511948
 ] 

Ted Yu commented on HBASE-20727:


lgtm

nit: FlushedSequenceIdFlusher class can be private.

> Persist FlushedSequenceId to speed up WAL split after cluster restart
> -
>
> Key: HBASE-20727
> URL: https://issues.apache.org/jira/browse/HBASE-20727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20727.002.patch, HBASE-20727.patch
>
>
> We use flushedSequenceIdByRegion and storeFlushedSequenceIdsByRegion in 
> ServerManager to record the latest flushed seqids of regions and stores. 
> During log split, we can use the seqids stored in those maps to filter out the 
> edits which do not need to be replayed. However, those maps are not persisted: 
> after a cluster restart or master restart, all flushed seqid info is lost. 
> Here I offer a way to persist that info to HDFS, so that even after a master 
> restart we can still use it to filter WAL edits and speed up replay.
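
The idea in the description can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the patch: `FlushedSeqIdSketch`, `Edit`, `editsToReplay`, `save` and `load` are hypothetical names, a single per-region map stands in for the ServerManager maps, and a local file with `region=seqId` lines stands in for the HDFS file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlushedSeqIdSketch {

    /** A WAL edit reduced to the two fields needed here (hypothetical type). */
    static final class Edit {
        final String region;
        final long seqId;
        Edit(String region, long seqId) { this.region = region; this.seqId = seqId; }
    }

    /** Keeps only edits newer than the last flushed sequence id of their region. */
    static List<Edit> editsToReplay(List<Edit> edits, Map<String, Long> flushedSeqIds) {
        List<Edit> out = new ArrayList<>();
        for (Edit e : edits) {
            // Unknown region: nothing is known to be flushed, so the edit is kept.
            long flushed = flushedSeqIds.getOrDefault(e.region, -1L);
            if (e.seqId > flushed) {
                out.add(e);
            }
        }
        return out;
    }

    /** Writes one "region=seqId" line per entry; a local file stands in for HDFS. */
    static void save(Map<String, Long> flushedSeqIds, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> en : flushedSeqIds.entrySet()) {
            lines.add(en.getKey() + "=" + en.getValue());
        }
        Files.write(file, lines);
    }

    /** Reads the map back, e.g. after a simulated master restart. */
    static Map<String, Long> load(Path file) throws IOException {
        Map<String, Long> m = new HashMap<>();
        for (String line : Files.readAllLines(file)) {
            int eq = line.lastIndexOf('=');
            m.put(line.substring(0, eq), Long.parseLong(line.substring(eq + 1)));
        }
        return m;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> flushed = new HashMap<>();
        flushed.put("region-a", 10L);

        Path file = Files.createTempFile("flushed-seqids", ".txt");
        save(flushed, file);
        Map<String, Long> reloaded = load(file);   // survives the "restart"

        List<Edit> edits = Arrays.asList(
            new Edit("region-a", 9), new Edit("region-a", 11), new Edit("region-b", 1));
        for (Edit e : editsToReplay(edits, reloaded)) {
            System.out.println(e.region + " " + e.seqId);
        }
        Files.delete(file);
    }
}
```

Without the reloaded map, every edit would have to be replayed after a restart; with it, edits at or below the flushed sequence id can be skipped.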





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Attachment: 20723.v2.txt

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, logs.zip
>
>





[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20723:
---
Status: Patch Available  (was: Open)

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, logs.zip
>
>





[jira] [Commented] (HBASE-20729) B&R BackupLogCleaner must ignore ProcV2 WAL files

2018-06-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511931#comment-16511931
 ] 

Ted Yu commented on HBASE-20729:


lgtm

Please remove unused imports.

> B&R BackupLogCleaner must ignore ProcV2 WAL files
> -
>
> Key: HBASE-20729
> URL: https://issues.apache.org/jira/browse/HBASE-20729
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-20729-v1.patch
>
>
> These are WAL files B&R does not need for backup. The issue does not affect 
> B&R functionality though. 





[jira] [Updated] (HBASE-20630) B&R: Delete command enhancements

2018-06-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-20630:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> B&R: Delete command enhancements
> 
>
> Key: HBASE-20630
> URL: https://issues.apache.org/jira/browse/HBASE-20630
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20630-v1.patch, HBASE-20630-v2.patch, 
> HBASE-20630-v3.patch, HBASE-20630-v4.patch
>
>
> Make the command more usable. Currently, the user needs to provide a list of 
> backup ids to delete. It would be nice to have more convenient options, such 
> as deleting all backups older than XXX days.
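
As a rough illustration of the "older than XXX days" option, the selection step might look like the Java sketch below. Everything here is an assumption for illustration: `BackupAgeSketch` and `olderThan` are hypothetical names, and the backup ids and completion times are made up rather than read from the backup system table.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BackupAgeSketch {

    /** Returns the ids of backups completed more than {@code days} days before {@code now}. */
    static List<String> olderThan(Map<String, Instant> completedAt, int days, Instant now) {
        Instant cutoff = now.minus(Duration.ofDays(days));
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Instant> e : completedAt.entrySet()) {
            if (e.getValue().isBefore(cutoff)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Made-up backup ids and completion times, for illustration only.
        Map<String, Instant> backups = new LinkedHashMap<>();
        backups.put("backup_old", Instant.parse("2018-06-01T00:00:00Z"));
        backups.put("backup_recent", Instant.parse("2018-06-10T00:00:00Z"));

        Instant now = Instant.parse("2018-06-13T00:00:00Z");
        // The resulting ids could then be handed to the existing delete-by-id path.
        System.out.println(olderThan(backups, 7, now));
    }
}
```

Resolving the age filter to a list of ids first keeps the existing delete-by-id behavior unchanged and only adds a selection step in front of it.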




