[jira] [Commented] (HBASE-15583) Any HTD we give out should be immutable

2017-03-23 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939905#comment-15939905
 ] 

Chia-Ping Tsai commented on HBASE-15583:


TestRegionReplicaFailover passes locally.
Would you please take a look? [~stack] 

> Any HTD we give out should be immutable
> ---
>
> Key: HBASE-15583
> URL: https://issues.apache.org/jira/browse/HBASE-15583
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gabor Liptak
>Assignee: Chia-Ping Tsai
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-15583.v0.patch, HBASE-15583.v1.patch
>
>
> From [~enis] in https://issues.apache.org/jira/browse/HBASE-15505:
> PS Should UnmodifyableHTableDescriptor be renamed to 
> UnmodifiableHTableDescriptor?
> It should be named ImmutableHTableDescriptor to be consistent with 
> collections naming. Let's do this as a subtask of the parent jira, not here. 
> Thinking about it though, why would we return an Immutable HTD in 
> HTable.getTableDescriptor() versus a mutable HTD in 
> Admin.getTableDescriptor(). It does not make sense. Should we just get rid of 
> the Immutable ones?
> We also have UnmodifyableHRegionInfo which is not used at the moment it 
> seems. 
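
For illustration only, here is a minimal sketch of how a read-only descriptor could be handed out, in the spirit of the naming discussion above. This is not the attached patch; extending HTableDescriptor this way and the exact setter signatures shown are assumptions.

{code}
import org.apache.hadoop.hbase.HTableDescriptor;

// Hypothetical sketch, not the HBASE-15583 patch: a descriptor whose mutators
// refuse to change state, so callers of getTableDescriptor() cannot alter it.
public class ImmutableHTableDescriptor extends HTableDescriptor {

  public ImmutableHTableDescriptor(HTableDescriptor desc) {
    super(desc); // the copy constructor takes a defensive snapshot of the state
  }

  @Override
  public HTableDescriptor setValue(String key, String value) {
    throw new UnsupportedOperationException("HTableDescriptor is read-only");
  }

  @Override
  public HTableDescriptor setMaxFileSize(long maxFileSize) {
    throw new UnsupportedOperationException("HTableDescriptor is read-only");
  }
}
{code}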



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17817) Make Regionservers log which tables it removed coprocessors from when aborting

2017-03-23 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939903#comment-15939903
 ] 

Anoop Sam John commented on HBASE-17817:


When the exceptions come from RegionObserver CPs, we can include the table info as 
well. But we also have WALObserver, MasterObserver, etc., which are not really tied 
to tables. Yes, it would be better to include the table info when we get exceptions 
from region-level CPs.
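
As a purely hypothetical illustration of the suggestion above (the helper class, method, and accessor chain are assumptions, not the actual RegionCoprocessorHost code), the error path for a region-level CP could include the table name taken from the region environment:

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;

// Hypothetical helper only, not the real coprocessor host code.
public class CoprocessorErrorLogger {
  private static final Log LOG = LogFactory.getLog(CoprocessorErrorLogger.class);

  /** Log which coprocessor failed and which table it was attached to. */
  public static void logRegionCoprocessorError(RegionCoprocessorEnvironment env,
      Throwable t, boolean abortOnError) {
    TableName table = env.getRegion().getRegionInfo().getTable();
    LOG.error("Coprocessor " + env.getInstance().getClass().getName()
        + " attached to table " + table + " threw an exception; "
        + (abortOnError ? "aborting the regionserver" : "removing the coprocessor"), t);
  }
}
{code}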

> Make Regionservers log which tables it removed coprocessors from when aborting
> --
>
> Key: HBASE-17817
> URL: https://issues.apache.org/jira/browse/HBASE-17817
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors, regionserver
>Affects Versions: 1.1.2
>Reporter: Steen Manniche
>  Labels: logging
>
> When a coprocessor throws a runtime exception (e.g. NPE), the regionserver 
> handles this according to {{hbase.coprocessor.abortonerror}}.
> The output in the logs gives no indication as to which table the coprocessor 
> was removed from (or which version or jarfile is the culprit). This causes 
> longer debugging and recovery times.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17831) Support small scan in thrift2

2017-03-23 Thread Guangxu Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng updated HBASE-17831:
--
Attachment: HBASE-17831-branch-1.patch

> Support small scan in thrift2
> -
>
> Key: HBASE-17831
> URL: https://issues.apache.org/jira/browse/HBASE-17831
> Project: HBase
>  Issue Type: Improvement
>  Components: Thrift
>Reporter: Guangxu Cheng
> Attachments: HBASE-17831-branch-1.patch
>
>
> Support small scan in thrift2
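
For readers unfamiliar with the term: this is not part of the attached patch, but roughly what a "small" scan looks like in the Java client API that the thrift2 layer would need to expose. The table name, row keys, and caching size below are made up.

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SmallScanExample {
  public static void printRows(Connection conn) throws IOException {
    Scan scan = new Scan(Bytes.toBytes("row-000"), Bytes.toBytes("row-100"));
    scan.setSmall(true);   // hint: short, bounded scan, typically served in one RPC via pread
    scan.setCaching(100);
    try (Table table = conn.getTable(TableName.valueOf("t1"));
         ResultScanner scanner = table.getScanner(scan)) {
      for (Result r : scanner) {
        System.out.println(Bytes.toString(r.getRow()));
      }
    }
  }
}
{code}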



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17831) Support small scan in thrift2

2017-03-23 Thread Guangxu Cheng (JIRA)
Guangxu Cheng created HBASE-17831:
-

 Summary: Support small scan in thrift2
 Key: HBASE-17831
 URL: https://issues.apache.org/jira/browse/HBASE-17831
 Project: HBase
  Issue Type: Improvement
  Components: Thrift
Reporter: Guangxu Cheng


Support small scan in thrift2



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17828) Remove IS annotations from IA.Public classes

2017-03-23 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939860#comment-15939860
 ] 

Duo Zhang commented on HBASE-17828:
---

Yes.

But I think there are still some concerns: on the mailing list, [~jerryhe] 
mentioned that Spark uses IS annotations to describe the stability of its 
public API.

> Remove IS annotations from IA.Public classes
> 
>
> Key: HBASE-17828
> URL: https://issues.apache.org/jira/browse/HBASE-17828
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>
> As discussed on the dev mailing list, we do not mention the IS annotations 
> for the public API in our refguide, so the IS annotations on IA.Public classes 
> only confuse people.
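
For context, a hedged sketch of the pattern under discussion (the classification package name is an assumption about the 1.x/2.0 code base): the proposal is to keep the audience annotation and drop the stability one on public classes.

{code}
import org.apache.hadoop.hbase.classification.InterfaceAudience;
import org.apache.hadoop.hbase.classification.InterfaceStability;

// Before: both annotations on a public-facing class.
@InterfaceAudience.Public
@InterfaceStability.Evolving
class SomePublicClientClassBefore {
}

// After (what this issue proposes): IA.Public alone, since the refguide
// defines compatibility for the public API without referring to IS.
@InterfaceAudience.Public
class SomePublicClientClassAfter {
}
{code}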



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17829) [C++] Update for simple_client

2017-03-23 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar updated HBASE-17829:
--
Attachment: HBASE-17829.HBASE-14850.v1.patch

Adding AsyncCall() to make RPC calls, as the simple-client binary was failing.

Thanks

> [C++] Update for simple_client
> --
>
> Key: HBASE-17829
> URL: https://issues.apache.org/jira/browse/HBASE-17829
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-17829.HBASE-14850.v1.patch
>
>
> simple_client binary is failing. This patch has a fix for this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17828) Remove IS annotations from IA.Public classes

2017-03-23 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939803#comment-15939803
 ] 

Yu Li commented on HBASE-17828:
---

So this JIRA will cover necessary changes on both source codes and refguide, 
right?

> Remove IS annotations from IA.Public classes
> 
>
> Key: HBASE-17828
> URL: https://issues.apache.org/jira/browse/HBASE-17828
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>
> As discussed on the dev mailing list, we do not mention the IS annotations 
> for the public API in our refguide, so the IS annotations on IA.Public classes 
> only confuse people.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17830) [C++] Test Util support for standalone HBase instance

2017-03-23 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar updated HBASE-17830:
--
Attachment: HBASE-17830.HBASE-14850.v1.patch

This patch adds methods to run a standalone HBase instance.
Thanks.

> [C++] Test Util support for standalone HBase instance
> 
>
> Key: HBASE-17830
> URL: https://issues.apache.org/jira/browse/HBASE-17830
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-17830.HBASE-14850.v1.patch
>
>
> Running a standalone instance was removed from TestUtil after the introduction of 
> the mini cluster. We are re-introducing methods to run a standalone instance if 
> required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17829) [C++] Update for simple_client

2017-03-23 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar reassigned HBASE-17829:
-

Assignee: Sudeep Sunthankar

> [C++] Update for simple_client
> --
>
> Key: HBASE-17829
> URL: https://issues.apache.org/jira/browse/HBASE-17829
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
>
> simple_client binary is failing. This patch has a fix for this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17830) [C++] Test Util support for standalone HBase instance

2017-03-23 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar reassigned HBASE-17830:
-

Assignee: Sudeep Sunthankar

> [C++] Test Util support for standalone HBase instance
> 
>
> Key: HBASE-17830
> URL: https://issues.apache.org/jira/browse/HBASE-17830
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
>
> Running a standalone instance was removed from TestUtil after the introduction of 
> the mini cluster. We are re-introducing methods to run a standalone instance if 
> required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17830) [C++] Test Util support for standalone HBase instance

2017-03-23 Thread Sudeep Sunthankar (JIRA)
Sudeep Sunthankar created HBASE-17830:
-

 Summary: [C++] Test Util support for standalone HBase instance
 Key: HBASE-17830
 URL: https://issues.apache.org/jira/browse/HBASE-17830
 Project: HBase
  Issue Type: Sub-task
Reporter: Sudeep Sunthankar


Running a standalone instance was removed from TestUtil after the introduction of 
the mini cluster. We are re-introducing methods to run a standalone instance if 
required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17829) [C++] Update for simple_client

2017-03-23 Thread Sudeep Sunthankar (JIRA)
Sudeep Sunthankar created HBASE-17829:
-

 Summary: [C++] Update for simple_client
 Key: HBASE-17829
 URL: https://issues.apache.org/jira/browse/HBASE-17829
 Project: HBase
  Issue Type: Sub-task
Reporter: Sudeep Sunthankar


simple_client binary is failing. This patch has a fix for this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17765) Reviving the merge possibility in the CompactingMemStore

2017-03-23 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939746#comment-15939746
 ] 

Anoop Sam John commented on HBASE-17765:


Patch looks good.
{code}
public static final String COMPACTING_MEMSTORE_THRESHOLD_KEY =
    "hbase.hregion.compacting.memstore.threshold";
{code}
I am not sure whether this name is correct.  This is a threshold on the number of 
segments above which there will be an in-memory merge, so from the name alone one 
may not get which threshold it is.  The compacting memstore will also have a 
threshold for the in-memory flush (segment flush to pipeline), so please give it a 
better/clearer name.
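
To make the naming concern concrete, a purely hypothetical alternative key (illustrative only; not necessarily what the patch settled on):

{code}
// Hypothetical name only, to illustrate the review comment: the key should say
// which threshold it controls (the pipeline segment count that triggers an
// in-memory merge), since the compacting memstore has other thresholds too,
// e.g. for the in-memory flush of the active segment to the pipeline.
public static final String COMPACTING_MEMSTORE_MERGE_SEGMENT_THRESHOLD_KEY =
    "hbase.hregion.compacting.memstore.merge.segment.threshold";
{code}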

> Reviving the merge possibility in the CompactingMemStore
> 
>
> Key: HBASE-17765
> URL: https://issues.apache.org/jira/browse/HBASE-17765
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>Assignee: Anastasia Braginsky
> Fix For: 2.0.0
>
> Attachments: HBASE-17765-V01.patch, HBASE-17765-V02.patch, 
> HBASE-17765-V03.patch
>
>
> According to the new performance results presented in HBASE-16417, we see that 
> the 90th-percentile read latency of the BASIC policy is too high due to the need 
> to traverse too many segments in the pipeline. In this JIRA we correct the bug 
> in the merge sizing calculations and allow the pipeline size threshold to be a 
> configurable parameter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15314) Allow more than one backing file in bucketcache

2017-03-23 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939639#comment-15939639
 ] 

Anoop Sam John commented on HBASE-15314:


+1 to backport this to branch-1

> Allow more than one backing file in bucketcache
> ---
>
> Key: HBASE-15314
> URL: https://issues.apache.org/jira/browse/HBASE-15314
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: chunhui shen
> Fix For: 2.0
>
> Attachments: FileIOEngine.java, HBASE-15314.master.001.patch, 
> HBASE-15314.master.001.patch, HBASE-15314.patch, HBASE-15314-v2.patch, 
> HBASE-15314-v3.patch, HBASE-15314-v4.patch, HBASE-15314-v5.patch, 
> HBASE-15314-v6.patch, HBASE-15314-v7.patch, HBASE-15314-v8.patch
>
>
> Allow bucketcache to use more than just one backing file: e.g. the chassis has more 
> than one SSD in it.
> Usage (setting the following configurations in hbase-site.xml):
> {quote}
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>files:/mnt/disk1/bucketcache,/mnt/disk2/bucketcache,/mnt/disk3/bucketcache,/mnt/disk4/bucketcache</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>1048576</value>
> </property>
> {quote}
> The above setting means the total capacity of the cache is 1048576 MB (1 TB); with 
> four files, each file's length will be set to 0.25 TB.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-15894) Put Object

2017-03-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939582#comment-15939582
 ] 

Enis Soztutar commented on HBASE-15894:
---

[~sudeeps] can you please take a look at this and HBASE-16365 whenever you have 
some time. Thanks. 

> Put Object
> --
>
> Key: HBASE-15894
> URL: https://issues.apache.org/jira/browse/HBASE-15894
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Enis Soztutar
> Attachments: HBASE-15894.HBASE-14850.v1.patch, hbase-15894-v1.patch
>
>
> Patch for creating Put objects. A Put object so created can be passed to the 
> Table implementation for inserting data.
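
For context, the native Put presumably mirrors the Java client's flow; a minimal Java sketch of that flow (the table, family, qualifier, and values below are made up):

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
  public static void writeCell(Connection conn) throws IOException {
    try (Table table = conn.getTable(TableName.valueOf("t1"))) {
      Put put = new Put(Bytes.toBytes("row1"));                  // row key
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("col1"),   // family, qualifier
          Bytes.toBytes("value1"));                              // cell value
      table.put(put);                                            // hand the Put to the Table
    }
  }
}
{code}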



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15894) Put Object

2017-03-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-15894:
--
Attachment: hbase-15894-v1.patch

v1 patch. Mutation and Put, as well as a bunch of needed supporting code. 

> Put Object
> --
>
> Key: HBASE-15894
> URL: https://issues.apache.org/jira/browse/HBASE-15894
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Enis Soztutar
> Attachments: HBASE-15894.HBASE-14850.v1.patch, hbase-15894-v1.patch
>
>
> Patch for creating Put objects. A Put object so created can be passed to the 
> Table implementation for inserting data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-16365) Hook up Put work flow

2017-03-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16365:
--
Attachment: hbase-16365_v0.patch

v0 patch. Single puts are working. Need to add some unit tests before v1. 
Depends on the {{Put}} patch (HBASE-15894). 

> Hook up Put work flow
> -
>
> Key: HBASE-16365
> URL: https://issues.apache.org/jira/browse/HBASE-16365
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Enis Soztutar
> Attachments: hbase-16365_v0.patch
>
>
> Make Puts end to end (table -> async table -> rpc). 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-16365) Hook up Put work flow

2017-03-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reassigned HBASE-16365:
-

Assignee: Enis Soztutar  (was: Elliott Clark)

> Hook up Put work flow
> -
>
> Key: HBASE-16365
> URL: https://issues.apache.org/jira/browse/HBASE-16365
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Enis Soztutar
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-16365) Hook up Put work flow

2017-03-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-16365:
--
Description: Make Puts end to end (table -> async table -> rpc). 

> Hook up Put work flow
> -
>
> Key: HBASE-16365
> URL: https://issues.apache.org/jira/browse/HBASE-16365
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Enis Soztutar
>
> Make Puts end to end (table -> async table -> rpc). 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-15894) Put Object

2017-03-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reassigned HBASE-15894:
-

Assignee: Enis Soztutar

> Put Object
> --
>
> Key: HBASE-15894
> URL: https://issues.apache.org/jira/browse/HBASE-15894
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Enis Soztutar
> Attachments: HBASE-15894.HBASE-14850.v1.patch
>
>
> Patch for creating Put objects. A Put object so created can be passed to the 
> Table implementation for inserting data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer

2017-03-23 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-15533:
-
Attachment: HBASE-15533.patch

Including the FavoredRSGroup-based balancer. This is built on top of the 
FavoredStochasticBalancer from HBASE-16942. Once HBASE-16942 gets in, I will 
submit this patch for precommit builds.

> Add RSGroup Favored Balancer
> 
>
> Key: HBASE-15533
> URL: https://issues.apache.org/jira/browse/HBASE-15533
> Project: HBase
>  Issue Type: Sub-task
>  Components: FavoredNodes
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-15533.patch, HBASE-15533.rough.draft.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939568#comment-15939568
 ] 

Chia-Ping Tsai commented on HBASE-17821:


LGTM. +1

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Yi Liang
>Priority: Trivial
>  Labels: beginner
> Attachments: HBase-17821-V1.patch
>
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}
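
For clarity, here is the method with the stated one-line fix applied (appending each map m rather than the whole configs list), as a sketch of the same fragment:

{code}
  @Override
  public String toString() {
    StringBuffer sb = new StringBuffer();
    sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
    for (ImmutableConfigMap m : this.configs) {
      sb.append(m);   // was: sb.append(this.configs), which appended the whole list on every iteration
    }
    return sb.toString();
  }
{code}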



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer

2017-03-23 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-15533:
-
Summary: Add RSGroup Favored Balancer  (was: RSGroup related favored nodes 
enhancements)

> Add RSGroup Favored Balancer
> 
>
> Key: HBASE-15533
> URL: https://issues.apache.org/jira/browse/HBASE-15533
> Project: HBase
>  Issue Type: Sub-task
>  Components: FavoredNodes
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0
>
> Attachments: HBASE-15533.rough.draft.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17828) Remove IS annotations from IA.Public classes

2017-03-23 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-17828:
-

 Summary: Remove IS annotations from IA.Public classes
 Key: HBASE-17828
 URL: https://issues.apache.org/jira/browse/HBASE-17828
 Project: HBase
  Issue Type: Bug
Reporter: Duo Zhang


As discussed on the dev mailing list, we do not mention the IS annotations for 
the public API in our refguide, so the IS annotations on IA.Public classes only 
confuse people.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-16438) Create a cell type so that chunk id is embedded in it

2017-03-23 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939526#comment-15939526
 ] 

Yu Li commented on HBASE-16438:
---

bq. Not clearing/removing from Q is not the real issue...That is why that jira 
changed it to keep only chunks from pool.
Exactly.

bq. But as such there wont be OOME for sure.
Yes, with the fix for memstore size accounting in HBASE-16194, there won't be an OOME.


bq. Now we will keep ref to Chunks as long as the MSLAB, which created it, is 
not closed... But to have a better GC in this scenario, HBASE-16195 would have 
helped , which will be broken by this jira.
Yes, this is a true concern...

> Create a cell type so that chunk id is embedded in it
> -
>
> Key: HBASE-16438
> URL: https://issues.apache.org/jira/browse/HBASE-16438
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-16438_1.patch, 
> HBASE-16438_3_ChunkCreatorwrappingChunkPool.patch, 
> HBASE-16438_4_ChunkCreatorwrappingChunkPool.patch, HBASE-16438.patch, 
> MemstoreChunkCell_memstoreChunkCreator_oldversion.patch, 
> MemstoreChunkCell_trunk.patch
>
>
> For CellChunkMap we may need a cell that embeds the id of the chunk out of which 
> it was created, so that when doing flattening we can use the chunk id as 
> metadata. More details will follow once the initial tasks are completed. 
> Why we need to embed the chunkid in the Cell is described by [~anastas] in 
> this remark over in parent issue 
> https://issues.apache.org/jira/browse/HBASE-14921?focusedCommentId=15244119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15244119
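
A hypothetical sketch of the idea (the class name and layout are assumptions, not the attached patches): a cell that carries the id of the MSLAB chunk it was allocated from, so flattening into a CellChunkMap can record the chunk id as metadata.

{code}
import org.apache.hadoop.hbase.KeyValue;

// Hypothetical illustration only: a KeyValue whose backing bytes live inside an
// MSLAB chunk, bundled with the id of that chunk.
public class ChunkBackedCell extends KeyValue {
  private final int chunkId;

  public ChunkBackedCell(byte[] chunkData, int offset, int length, int chunkId) {
    super(chunkData, offset, length); // the cell bytes sit inside the chunk's byte[]
    this.chunkId = chunkId;
  }

  /** Chunk id to store as metadata when flattening into a CellChunkMap. */
  public int getChunkId() {
    return chunkId;
  }
}
{code}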



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17816) HRegion#mutateRowWithLocks should update writeRequestCount metric

2017-03-23 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939483#comment-15939483
 ] 

Ashu Pachauri commented on HBASE-17816:
---

[~Weizhan Zeng] It's a matter of semantics as to what we count as a write 
request: a request that was issued or a request that succeeded. If you look at 
other parts of the code, we count any request that reaches HRegion, whether 
it's successful or not. For example, look at HRegion#batchMutate.

> HRegion#mutateRowWithLocks should update writeRequestCount metric
> -
>
> Key: HBASE-17816
> URL: https://issues.apache.org/jira/browse/HBASE-17816
> Project: HBase
>  Issue Type: Bug
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-17816.master.001.patch
>
>
> Currently, all the calls that use HRegion#mutateRowWithLocks miss the 
> writeRequestCount metric. The mutateRowWithLocks base method should update 
> the metric.
> Examples are checkAndMutate calls through RSRpcServices#multi, the 
> Region#mutateRow API, and the MultiRowMutationProcessor coprocessor endpoint.
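
A self-contained analogy of the intended fix (not HRegion itself; names are made up): the counter is bumped as soon as the call reaches the region, matching the batchMutate semantics described in the comment above.

{code}
import java.util.concurrent.atomic.LongAdder;

// Analogy only, not the attached patch: every mutate-with-locks call bumps the
// write-request counter on entry, before the work that may still fail, so issued
// (not just successful) requests are counted.
class RegionMetricsSketch {
  private final LongAdder writeRequestsCount = new LongAdder();

  void mutateRowWithLocks(Runnable mutation) {
    writeRequestsCount.increment();   // count the request as soon as it reaches the region
    mutation.run();                   // the mutation itself may still fail afterwards
  }

  long getWriteRequestsCount() {
    return writeRequestsCount.sum();
  }
}
{code}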



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939473#comment-15939473
 ] 

Jerry He commented on HBASE-17827:
--

I think either of your approaches looks fine.
Are you still going to use the same chore mechanism to re-login from the cache, 
even though the cache has a limited lifetime?

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in thread "main" java.io.IOException: Running in secure mode, but 
> config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> {noformat}
> These tools should still work with the default credential-cache login, at 
> least when a client keytab is not configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939444#comment-15939444
 ] 

Hadoop QA commented on HBASE-17287:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 3m 3s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 10s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for 
instructions. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
7s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
41s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 18s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 99m 6s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 141m 8s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12860230/17287.master.v3.txt |
| JIRA Issue | HBASE-17287 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 4c315fa47d51 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f1c1f25 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6209/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6209/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Master becomes a zombie if filesystem object closes
> ---

[jira] [Commented] (HBASE-15314) Allow more than one backing file in bucketcache

2017-03-23 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939356#comment-15939356
 ] 

Zach York commented on HBASE-15314:
---

[~zjushch] Any chance we can get this backported to branch-1? Otherwise, I can 
try to take a look soon.

> Allow more than one backing file in bucketcache
> ---
>
> Key: HBASE-15314
> URL: https://issues.apache.org/jira/browse/HBASE-15314
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: chunhui shen
> Fix For: 2.0
>
> Attachments: FileIOEngine.java, HBASE-15314.master.001.patch, 
> HBASE-15314.master.001.patch, HBASE-15314.patch, HBASE-15314-v2.patch, 
> HBASE-15314-v3.patch, HBASE-15314-v4.patch, HBASE-15314-v5.patch, 
> HBASE-15314-v6.patch, HBASE-15314-v7.patch, HBASE-15314-v8.patch
>
>
> Allow bucketcache to use more than just one backing file: e.g. the chassis has more 
> than one SSD in it.
> Usage (setting the following configurations in hbase-site.xml):
> {quote}
> <property>
>   <name>hbase.bucketcache.ioengine</name>
>   <value>files:/mnt/disk1/bucketcache,/mnt/disk2/bucketcache,/mnt/disk3/bucketcache,/mnt/disk4/bucketcache</value>
> </property>
> <property>
>   <name>hbase.bucketcache.size</name>
>   <value>1048576</value>
> </property>
> {quote}
> The above setting means the total capacity of the cache is 1048576 MB (1 TB); with 
> four files, each file's length will be set to 0.25 TB.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17453) add Ping into HBase server for deprecated GetProtocolVersion

2017-03-23 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939355#comment-15939355
 ] 

Tianying Chang commented on HBASE-17453:


[~saint@gmail.com] One more question: I can see those methods from 
Admin.proto have higher priority. But what are the criteria to decide which 
methods should go into Admin.proto vs. Client.proto? It seems all of them are 
implemented in RSRpcServices.java anyway. 

> add Ping into HBase server for deprecated GetProtocolVersion
> 
>
> Key: HBASE-17453
> URL: https://issues.apache.org/jira/browse/HBASE-17453
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.2.2
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>Priority: Minor
> Fix For: 2.0.0, 1.2.5
>
> Attachments: HBASE-17453-1.2.patch, 
> HBASE-17453-master-fixWhiteSpace.patch, HBASE-17453-master.patch, 
> HBASE-17453-master-v1.patch, HBASE-17453-master-v2.patch
>
>
> Our HBase service is hosted in AWS. We saw cases where the connection between 
> the client (Asynchbase in our case) and the server stopped working without 
> throwing any exception, so traffic got stuck. So we added a "Ping" feature in 
> AsyncHBase 1.5 by utilizing the GetProtocolVersion() API provided on the RS 
> side: if there is no traffic for a given time, we send the "Ping"; if there is 
> no response to the "Ping", we assume the connection is bad and reconnect. 
> Now we are upgrading the cluster from 94 to 1.2. However, GetProtocolVersion() 
> is deprecated. To be able to support the same detect/reconnect feature, we 
> added Ping() in our internal HBase 1.2 branch, and also patched Asynchbase 1.7 
> accordingly.
> We would like to open source this feature since it is useful for this use case 
> in an AWS environment. 
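
A generic sketch of the detect-and-reconnect idea described above (not the AsyncHBase or HBase 1.2 patch; the interface and names are made up): if the channel has been idle past a threshold, send a ping, and if no reply arrives within a timeout, drop and re-establish the connection.

{code}
import java.util.concurrent.TimeUnit;

// Generic illustration of the keepalive logic; Connection is a made-up interface.
class PingKeepalive {
  interface Connection {
    long lastActivityMillis();
    boolean ping(long timeout, TimeUnit unit);  // true if the server answered in time
    void reconnect();
  }

  private final long idleThresholdMs;
  private final long pingTimeoutMs;

  PingKeepalive(long idleThresholdMs, long pingTimeoutMs) {
    this.idleThresholdMs = idleThresholdMs;
    this.pingTimeoutMs = pingTimeoutMs;
  }

  /** Called periodically by a timer: probe idle connections and replace dead ones. */
  void check(Connection conn) {
    long idle = System.currentTimeMillis() - conn.lastActivityMillis();
    if (idle < idleThresholdMs) {
      return;                                     // recent traffic, nothing to do
    }
    if (!conn.ping(pingTimeoutMs, TimeUnit.MILLISECONDS)) {
      conn.reconnect();                           // no answer: assume the link is dead
    }
  }
}
{code}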



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Fix Version/s: 2.0
   1.4.0

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Fix For: 1.4.0, 2.0
>
> Attachments: 17287.master.v2.txt, 17287.master.v3.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Attachment: 17287.master.v3.txt

In patch v3, I use checkFileSystem() to see if filesystem is available.

If filesystem is not available, abort server.
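
A hypothetical sketch of the v3 approach described above, not the attached patch; Master here is a stand-in interface for the real HMaster, and the call site is an assumption.

{code}
// Sketch only: when meta-server shutdown handling hits an IOException, probe the
// filesystem via checkFileSystem() and abort the master rather than retrying
// forever against a closed FileSystem object.
class ZombieMasterGuard {
  interface Master {
    boolean checkFileSystem();               // false once the FileSystem object is closed
    void abort(String reason, Throwable cause);
  }

  /** On a failed log-splitting attempt, abort instead of retrying forever. */
  static void onLogSplitFailure(Master master, java.io.IOException cause) {
    if (!master.checkFileSystem()) {
      master.abort("Filesystem not available; aborting master", cause);
    }
  }
}
{code}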

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.master.v3.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939210#comment-15939210
 ] 

Gary Helmling edited comment on HBASE-17827 at 3/23/17 8:53 PM:


bq. counter argument (that I don't think I agree with): AuthUtil.getAuthChore 
is meant for long running applications and as such users shouldn't be deploying 
applications that are based on it using a local kinit that will inevitably fail 
once renewal lifetimes are exceeded.

Yeah, there's certainly an argument there, which is useful in thinking about 
how to approach this.  My first take is if hbase.client.keytab.file is not 
configured or is empty, to log a warning and fall back to the credential cache 
behavior.  The log would at least give an indication on what it's doing, with 
instructions on what to configure for keytab logins.

The other approach I can think of is to require a config property to be set to 
override the keytab login.  So rather than the keytab config being missing (or 
overridden) in the config, you have to set say 
hbase.client.security.ccache=true, in which case getAuthChore() could skip the 
keytab login.

My use case was wanting to run the Canary tool as a different user with a 
credential cache (and on a different host without the keytab file) in order to 
test access.  So I think either of these would work for me.

Our only internal use of AuthUtil.getAuthChore() is in IntegrationTestBase and 
Canary.  But since AuthUtil is now part of the public API, we also need to 
consider if the current behavior is something users may be relying on.  If so, 
then I think the second approach better retains that compatibility, but I'm 
open to either.


was (Author: ghelmling):
bq. counter argument (that I don't think I agree with): AuthUtil.getAuthChore 
is meant for long running applications and as such users shouldn't be deploying 
applications that are based on it using a local kinit that will inevitably fail 
once renewal lifetimes are exceeded.

Yeah, there's certainly an argument there, which is useful in thinking about 
how to approach this.  My first take is if hbase.client.keytab.file is not 
configured or is empty, to log a warning and fall back to the credential cache 
behavior.  The log would at least give an indication on what it's doing, with 
instructions on what to configure for keytab logins.

The other approach I can think of is to require a config property to be set to 
override the keytab login.  So rather than the keytab config being missing (or 
overridden) in the config, you have to set say 
hbase.client.security.ccache=true, in which case getAuthChore() could skip the 
keytab login.

My use case was wanting to run the Canary tool as a different user with a 
credential cache (and on a different host without the keytab file) in order to 
test access.  So I think either of these would work for me.

Our only internal use of AuthUtil.getAuthChore() is in IntegrationTestBase and 
Canary.  But since AuthUtil is now part of the public API, we also need to 
consider which if the current behavior is something users may be relying on.  
If so, then I think the second approach better retains that compatibility, but 
I'm open to either.

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in threa

[jira] [Commented] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939210#comment-15939210
 ] 

Gary Helmling commented on HBASE-17827:
---

bq. counter argument (that I don't think I agree with): AuthUtil.getAuthChore 
is meant for long running applications and as such users shouldn't be deploying 
applications that are based on it using a local kinit that will inevitably fail 
once renewal lifetimes are exceeded.

Yeah, there's certainly an argument there, which is useful in thinking about 
how to approach this.  My first take is if hbase.client.keytab.file is not 
configured or is empty, to log a warning and fall back to the credential cache 
behavior.  The log would at least give an indication on what it's doing, with 
instructions on what to configure for keytab logins.

The other approach I can think of is to require a config property to be set to 
override the keytab login.  So rather than the keytab config being missing (or 
overridden) in the config, you have to set say 
hbase.client.security.ccache=true, in which case getAuthChore() could skip the 
keytab login.

My use case was wanting to run the Canary tool as a different user with a 
credential cache (and on a different host without the keytab file) in order to 
test access.  So I think either of these would work for me.

Our only internal use of AuthUtil.getAuthChore() is in IntegrationTestBase and 
Canary.  But since AuthUtil is now part of the public API, we also need to 
consider which if the current behavior is something users may be relying on.  
If so, then I think the second approach better retains that compatibility, but 
I'm open to either.
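
A hypothetical sketch of the decision logic discussed above, not the eventual AuthUtil change. The key "hbase.client.keytab.file" is the existing one from the issue description; "hbase.client.security.ccache" is the override proposed in this comment.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;

// Sketch only: decide whether to attempt a keytab login (and renewal chore) or
// fall back to the user's credential cache from kinit.
class AuthChoreLoginSketch {
  private static final Log LOG = LogFactory.getLog(AuthChoreLoginSketch.class);

  /** Returns true if a keytab-based login should be attempted. */
  static boolean shouldUseKeytabLogin(Configuration conf) {
    boolean ccacheRequested = conf.getBoolean("hbase.client.security.ccache", false);
    String keytab = conf.get("hbase.client.keytab.file", "");
    if (ccacheRequested || keytab.isEmpty()) {
      LOG.warn("No client keytab configured (or credential cache explicitly requested); "
          + "falling back to the ticket cache login. Configure hbase.client.keytab.file "
          + "for a keytab-based login with renewal.");
      return false;       // caller skips the renewal chore and relies on kinit's cache
    }
    return true;
  }
}
{code}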

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in thread "main" java.io.IOException: Running in secure mode, but 
> config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> {noformat}
> These tools should still work with the default credential-cache login, at 
> least when a client keytab is not configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-17821:
-
Status: Patch Available  (was: Open)

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Priority: Trivial
>  Labels: beginner
> Attachments: HBase-17821-V1.patch
>
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-17821:
-
Attachment: HBase-17821-V1.patch

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Priority: Trivial
>  Labels: beginner
> Attachments: HBase-17821-V1.patch
>
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang reassigned HBASE-17821:


Assignee: Yi Liang

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Yi Liang
>Priority: Trivial
>  Labels: beginner
> Attachments: HBase-17821-V1.patch
>
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939168#comment-15939168
 ] 

Yi Liang commented on HBASE-17821:
--

Hi Chia-Ping, 
   Just saw this jira and provided a patch. 

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Priority: Trivial
>  Labels: beginner
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-17821:
-
Attachment: HBase-17821-V1.patch

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Priority: Trivial
>  Labels: beginner
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17821) The CompoundConfiguration#toString is wrong

2017-03-23 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-17821:
-
Attachment: (was: HBase-17821-V1.patch)

> The CompoundConfiguration#toString is wrong
> ---
>
> Key: HBASE-17821
> URL: https://issues.apache.org/jira/browse/HBASE-17821
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Priority: Trivial
>  Labels: beginner
>
> Found this bug when reading the code. We don't use the API, so it is a trivial bug.
> sb.append(this.configs); -> sb.append(m);
> {noformat}
>   @Override
>   public String toString() {
> StringBuffer sb = new StringBuffer();
> sb.append("CompoundConfiguration: " + this.configs.size() + " configs");
> for (ImmutableConfigMap m : this.configs) {
>   sb.append(this.configs);
> }
> return sb.toString();
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14141) HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backup tables

2017-03-23 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939163#comment-15939163
 ] 

Vladimir Rodionov commented on HBASE-14141:
---

Ping [~te...@apache.org]

> HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits 
> from backup tables
> 
>
> Key: HBASE-14141
> URL: https://issues.apache.org/jira/browse/HBASE-14141
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Blocker
>  Labels: backup
> Fix For: HBASE-7912
>
> Attachments: HBASE-14141.HBASE-14123.v1.patch, HBASE-14141.v1.patch, 
> HBASE-14141.v2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939153#comment-15939153
 ] 

Hadoop QA commented on HBASE-17287:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 4s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for 
instructions. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
9s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 48s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 102m 35s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 143m 30s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12860192/17287.master.v2.txt |
| JIRA Issue | HBASE-17287 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux ba94894d1a06 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f1c1f25 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6207/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6207/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Master becomes a zombie if filesystem object closes
> 

[jira] [Commented] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939141#comment-15939141
 ] 

Ted Yu commented on HBASE-17287:


Unfortunately there is no dedicated subclass of IOE which expresses the close 
of the filesystem.

See the following HDFS tests which look for "Filesystem closed":

http://pastebin.com/m1Ax5E2H
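
To illustrate the kind of message check under discussion (a minimal sketch of 
the approach, not the attached patch; the helper name is hypothetical):

{code}
  // HDFS exposes no dedicated exception type for a closed FileSystem, so the
  // only available signal is the exception message itself.
  private static boolean isFilesystemClosed(IOException ioe) {
    String msg = ioe.getMessage();
    return msg != null && msg.contains("Filesystem closed");
  }
{code}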

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939131#comment-15939131
 ] 

Devaraj Das commented on HBASE-17287:
-

The approach seems brittle - doing string checks on exceptions. I am hoping 
there is a better way to address it.

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939105#comment-15939105
 ] 

Sean Busbey commented on HBASE-17827:
-

Counter-argument (which I don't think I agree with): AuthUtil.getAuthChore is 
meant for long-running applications, so users shouldn't deploy applications 
based on it with a local kinit login that will inevitably fail once renewal 
lifetimes are exceeded.

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in thread "main" java.io.IOException: Running in secure mode, but 
> config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> {noformat}
> These tools should still work with the default credential-cache login, at 
> least when a client keytab is not configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939099#comment-15939099
 ] 

Sean Busbey commented on HBASE-17827:
-

Since {{AuthUtil.getAuthChore}} is our preferred way for downstream application 
auth, I'm moving this to critical. We should make sure the solution includes a 
docs update that explains the fall-back.

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in thread "main" java.io.IOException: Running in secure mode, but 
> config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> {noformat}
> These tools should still work with the default credential-cache login, at 
> least when a client keytab is not configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-17827:

Priority: Critical  (was: Major)

> Client tools relying on AuthUtil.getAuthChore() break credential cache login
> 
>
> Key: HBASE-17827
> URL: https://issues.apache.org/jira/browse/HBASE-17827
> Project: HBase
>  Issue Type: Bug
>  Components: canary, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
>
> Client tools, such as Canary, which make use of keytab based logins with 
> AuthUtil.getAuthChore() do not allow any way to continue without a 
> keytab-based login when security is enabled.  Currently, when security is 
> enabled and the configuration lacks {{hbase.client.keytab.file}}, these tools 
> would fail with:
> {noformat}
> ERROR hbase.AuthUtil: Error while trying to perform the initial login: 
> Running in secure mode, but config doesn't have a keytab
> java.io.IOException: Running in secure mode, but config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> Exception in thread "main" java.io.IOException: Running in secure mode, but 
> config doesn't have a keytab
> at 
> org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
> at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
> at org.apache.hadoop.hbase.security.User.login(User.java:258)
> at 
> org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
> at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
> at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
> {noformat}
> These tools should still work with the default credential-cache login, at 
> least when a client keytab is not configured.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17827) Client tools relying on AuthUtil.getAuthChore() break credential cache login

2017-03-23 Thread Gary Helmling (JIRA)
Gary Helmling created HBASE-17827:
-

 Summary: Client tools relying on AuthUtil.getAuthChore() break 
credential cache login
 Key: HBASE-17827
 URL: https://issues.apache.org/jira/browse/HBASE-17827
 Project: HBase
  Issue Type: Bug
  Components: canary, security
Reporter: Gary Helmling
Assignee: Gary Helmling


Client tools, such as Canary, which make use of keytab based logins with 
AuthUtil.getAuthChore() do not allow any way to continue without a keytab-based 
login when security is enabled.  Currently, when security is enabled and the 
configuration lacks {{hbase.client.keytab.file}}, these tools would fail with:

{noformat}
ERROR hbase.AuthUtil: Error while trying to perform the initial login: Running 
in secure mode, but config doesn't have a keytab
java.io.IOException: Running in secure mode, but config doesn't have a keytab
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
at org.apache.hadoop.hbase.security.User.login(User.java:258)
at 
org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
Exception in thread "main" java.io.IOException: Running in secure mode, but 
config doesn't have a keytab
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:239)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.login(User.java:420)
at org.apache.hadoop.hbase.security.User.login(User.java:258)
at 
org.apache.hadoop.hbase.security.UserProvider.login(UserProvider.java:197)
at org.apache.hadoop.hbase.AuthUtil.getAuthChore(AuthUtil.java:98)
at org.apache.hadoop.hbase.tool.Canary.run(Canary.java:589)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.tool.Canary.main(Canary.java:1327)
{noformat}

These tools should still work with the default credential-cache login, at least 
when a client keytab is not configured.
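
A minimal sketch of the requested fall-back (illustrative only; it assumes the 
hbase.client.keytab.file key shown above and a ChoreService held by the tool, 
and the actual fix may well live inside AuthUtil itself):

{code}
  // Only start the keytab-based relogin chore when a client keytab is
  // configured; otherwise keep using the credential cache (kinit) login the
  // user already established.
  ScheduledChore authChore = null;
  if (conf.get("hbase.client.keytab.file") != null) {
    authChore = AuthUtil.getAuthChore(conf);
  }
  if (authChore != null) {
    choreService.scheduleChore(authChore);
  }
{code}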



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-03-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938970#comment-15938970
 ] 

Enis Soztutar commented on HBASE-17707:
---

Sorry, I meant replacing the line with: 
{code}
  return raw == 0 ? 0 : .1 + .9 * raw;
{code}

> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, test-balancer2-13617.out
>
>
> This patch includes new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computer for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.
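
A condensed sketch of the scoring described above (illustrative only; 
movesRequired(), worstCaseMoves() and the two weight fields are hypothetical 
stand-ins for the patch's actual members):

{code}
  double computeTableSkewCost() {
    double sum = 0, max = 0;
    for (int table = 0; table < numTables; table++) {
      // minimal moves needed to round-robin this table, normalized by the
      // worst case (the entire table stored on one server)
      double normalized = (double) movesRequired(table) / worstCaseMoves(table);
      sum += normalized;
      max = Math.max(max, normalized);
    }
    double avg = sum / numTables;
    // configurable blend: "every table a little skewed" vs "one table badly skewed"
    double weighted = avgWeight * avg + maxWeight * max;
    // the square root spreads the score more evenly across 0-1
    return Math.sqrt(weighted);
  }
{code}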



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Attachment: 17287.master.v2.txt

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Status: Patch Available  (was: Open)

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.master.v2.txt, 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
>   at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17453) add Ping into HBase server for deprecated GetProtocolVersion

2017-03-23 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938876#comment-15938876
 ] 

Tianying Chang commented on HBASE-17453:


[~saint@gmail.com]  Actually, as long as the method is in RSRpcServices it 
serves my purpose, since I just need to test the connection to a specific RS 
and reconnect if needed, so that other operations like get or mutate can 
succeed. It is just that AsyncHBase only has 
Client.proto/Cell.proto/HBase.proto/RPC.proto and no Admin.proto; that is just 
AsyncHBase's grouping/naming. I will move the Ping API into Admin.proto in the 
server patch (no need to change my AsyncHBase client-side code at all) and 
regenerate the patch.

> add Ping into HBase server for deprecated GetProtocolVersion
> 
>
> Key: HBASE-17453
> URL: https://issues.apache.org/jira/browse/HBASE-17453
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.2.2
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>Priority: Minor
> Fix For: 2.0.0, 1.2.5
>
> Attachments: HBASE-17453-1.2.patch, 
> HBASE-17453-master-fixWhiteSpace.patch, HBASE-17453-master.patch, 
> HBASE-17453-master-v1.patch, HBASE-17453-master-v2.patch
>
>
> Our HBase service is hosted in AWS. We saw cases where the connection between 
> the client (AsyncHBase in our case) and the server stopped working without 
> throwing any exception, so traffic got stuck. We therefore added a "Ping" 
> feature in AsyncHBase 1.5 by utilizing the GetProtocolVersion() API provided 
> on the RS side: if there is no traffic for a given time we send a "Ping", and 
> if no response comes back we assume the connection is bad and reconnect. 
> Now we are upgrading the cluster from 94 to 1.2. However, GetProtocolVersion() 
> is deprecated. To support the same detect/reconnect feature, we added Ping() 
> to our internal HBase 1.2 branch and patched AsyncHBase 1.7 accordingly.
> We would like to open source this feature since it is useful for this use 
> case in an AWS environment. 
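
A rough sketch of the detect/reconnect loop described above (the names are 
illustrative, not taken from the patch):

{code}
  // If the connection has been idle longer than the configured window, probe
  // it with a Ping RPC; if the Ping gets no response in time, assume the
  // connection is dead and reconnect before issuing further gets/mutations.
  if (System.currentTimeMillis() - lastTrafficMillis > idleTimeoutMillis) {
    if (!pingRegionServer(pingTimeoutMillis)) {
      reconnect();
    }
  }
{code}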



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading

2017-03-23 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938860#comment-15938860
 ] 

Vladimir Rodionov commented on HBASE-14417:
---

[~tedyu], please do not commit until I finish RB. Thanks.

> Incremental backup and bulk loading
> ---
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Ted Yu
>Priority: Blocker
>  Labels: backup
> Fix For: 2.0
>
> Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v11.txt, 
> 14417-tbl-ext.v14.txt, 14417-tbl-ext.v18.txt, 14417-tbl-ext.v19.txt, 
> 14417-tbl-ext.v20.txt, 14417-tbl-ext.v21.txt, 14417-tbl-ext.v22.txt, 
> 14417-tbl-ext.v9.txt, 14417.v11.txt, 14417.v13.txt, 14417.v1.txt, 
> 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 14417.v2.txt, 
> 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create a new full backup of the 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Here is the review board (out of date):
> https://reviews.apache.org/r/54258/
> In order not to miss the hfiles which are loaded into region directories in a 
> situation where the postBulkLoadHFile() hook is not called (the bulk load 
> being interrupted), we record hfile names through the preCommitStoreFile() 
> hook.
> At incremental backup time, we check for the presence of such hfiles. If they 
> are present, they become part of the incremental backup image.
> Here is the review board:
> https://reviews.apache.org/r/57790/
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE
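
To make the bookkeeping concrete, a rough sketch of the idea (the names below 
are illustrative stand-ins, not the actual classes or methods in the patch):

{code}
  // When a store file is about to be committed during a bulk load, record its
  // name in backup metadata. Even if the bulk load is interrupted before
  // postBulkLoadHFile() fires, the record survives.
  void onStoreFileCommit(TableName table, byte[] family, Path hfile) throws IOException {
    backupMetaTable.put(table, Bytes.toString(family), hfile.getName());
  }

  // At incremental backup time, every recorded hfile that still exists is
  // copied into the incremental backup image alongside the WAL-derived data.
{code}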



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading

2017-03-23 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938772#comment-15938772
 ] 

Josh Elser commented on HBASE-14417:


Passing over the +1 from RB.

Also, mind the whitespace error on commit.

> Incremental backup and bulk loading
> ---
>
> Key: HBASE-14417
> URL: https://issues.apache.org/jira/browse/HBASE-14417
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Ted Yu
>Priority: Blocker
>  Labels: backup
> Fix For: 2.0
>
> Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v11.txt, 
> 14417-tbl-ext.v14.txt, 14417-tbl-ext.v18.txt, 14417-tbl-ext.v19.txt, 
> 14417-tbl-ext.v20.txt, 14417-tbl-ext.v21.txt, 14417-tbl-ext.v22.txt, 
> 14417-tbl-ext.v9.txt, 14417.v11.txt, 14417.v13.txt, 14417.v1.txt, 
> 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 14417.v2.txt, 
> 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create a new full backup of the 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Here is the review board (out of date):
> https://reviews.apache.org/r/54258/
> In order not to miss the hfiles which are loaded into region directories in a 
> situation where the postBulkLoadHFile() hook is not called (the bulk load 
> being interrupted), we record hfile names through the preCommitStoreFile() 
> hook.
> At incremental backup time, we check for the presence of such hfiles. If they 
> are present, they become part of the incremental backup image.
> Here is the review board:
> https://reviews.apache.org/r/57790/
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17826) Backup: submitting M/R job to a particular Yarn queue

2017-03-23 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-17826:
-

 Summary: Backup: submitting M/R job to a particular Yarn queue
 Key: HBASE-17826
 URL: https://issues.apache.org/jira/browse/HBASE-17826
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


We need this to be configurable. Currently, all M/R jobs are submitted to a 
default queue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15331) HBase Backup/Restore Phase 2: Optimized Restore operation

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15331:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Optimized Restore operation
> -
>
> Key: HBASE-15331
> URL: https://issues.apache.org/jira/browse/HBASE-15331
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> The current implementation for restore uses a WALReplay M/R job. This has 
> performance and stability problems, since it uses the HBase client API to 
> insert data. We have to migrate to a bulk load approach: generate hfiles 
> directly from the snapshot and incremental images. We run a separate M/R job 
> for every backup image between the last FULL backup and the current 
> incremental backup we restore to, and for every table in the list (image). If 
> we have 10 tables and 30 days of incremental backup images, this results in 
> 30x10 = 300 M/R jobs. MUST be optimized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17825) Backup: further optimizations

2017-03-23 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-17825:
-

 Summary: Backup: further optimizations
 Key: HBASE-17825
 URL: https://issues.apache.org/jira/browse/HBASE-17825
 Project: HBase
  Issue Type: Improvement
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Some phases of backup and restore can be optimized:

# WALPlayer support for multiple tables
# Run DistCp once for all tables during backup/restore

The eventual goal:

# 2 M/R jobs per backup/restore



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17824) Add test for multiple RS per host support

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-17824:
--
Fix Version/s: 2.0.0

> Add test for multiple RS per host support
> -
>
> Key: HBASE-17824
> URL: https://issues.apache.org/jira/browse/HBASE-17824
> Project: HBase
>  Issue Type: Test
>Reporter: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17824) Add test for multiple RS per host support

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-17824:
--
Labels: backup  (was: )

> Add test for multiple RS per host support
> -
>
> Key: HBASE-17824
> URL: https://issues.apache.org/jira/browse/HBASE-17824
> Project: HBase
>  Issue Type: Test
>Reporter: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17824) Add test for multiple RS per host support

2017-03-23 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-17824:
-

 Summary: Add test for multiple RS per host support
 Key: HBASE-17824
 URL: https://issues.apache.org/jira/browse/HBASE-17824
 Project: HBase
  Issue Type: Test
Reporter: Vladimir Rodionov






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15331) HBase Backup/Restore Phase 2: Optimized Restore operation

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15331:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Optimized Restore operation
> -
>
> Key: HBASE-15331
> URL: https://issues.apache.org/jira/browse/HBASE-15331
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> The current implementation for restore uses a WALReplay M/R job. This has 
> performance and stability problems, since it uses the HBase client API to 
> insert data. We have to migrate to a bulk load approach: generate hfiles 
> directly from the snapshot and incremental images. We run a separate M/R job 
> for every backup image between the last FULL backup and the current 
> incremental backup we restore to, and for every table in the list (image). If 
> we have 10 tables and 30 days of incremental backup images, this results in 
> 30x10 = 300 M/R jobs. MUST be optimized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14441) HBase Backup/Restore Phase 2: Multiple RS per host support

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14441:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Multiple RS per host support
> --
>
> Key: HBASE-14441
> URL: https://issues.apache.org/jira/browse/HBASE-14441
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14137) HBase Backup/Restore Phase 2: Backup throttling

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14137:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Backup throttling
> ---
>
> Key: HBASE-14137
> URL: https://issues.apache.org/jira/browse/HBASE-14137
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> ExportSnapshot/DistCp supports IO throttling per map task; this needs to be 
> exposed to the backup utility command-line tool. Backups must not interfere 
> with regular HBase cluster operations. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14137) HBase Backup/Restore Phase 2: Backup throttling

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14137:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Backup throttling
> ---
>
> Key: HBASE-14137
> URL: https://issues.apache.org/jira/browse/HBASE-14137
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> ExportSnapshot/DistCp supports IO throttling per map task; this needs to be 
> exposed to the backup utility command-line tool. Backups must not interfere 
> with regular HBase cluster operations. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14134) HBase Backup/Restore Phase 2: Backup sets management

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14134:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Backup sets management
> 
>
> Key: HBASE-14134
> URL: https://issues.apache.org/jira/browse/HBASE-14134
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14132) HBase Backup/Restore Phase 2: History of backups

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14132:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: History of backups
> 
>
> Key: HBASE-14132
> URL: https://issues.apache.org/jira/browse/HBASE-14132
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14134) HBase Backup/Restore Phase 2: Backup sets management

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14134:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Backup sets management
> 
>
> Key: HBASE-14134
> URL: https://issues.apache.org/jira/browse/HBASE-14134
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14133) HBase Backup/Restore Phase 2: Status (and progress) of backup request

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14133:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Status (and progress) of backup request
> -
>
> Key: HBASE-14133
> URL: https://issues.apache.org/jira/browse/HBASE-14133
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14133) HBase Backup/Restore Phase 2: Status (and progress) of backup request

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14133:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Status (and progress) of backup request
> -
>
> Key: HBASE-14133
> URL: https://issues.apache.org/jira/browse/HBASE-14133
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14132) HBase Backup/Restore Phase 2: History of backups

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14132:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: History of backups
> 
>
> Key: HBASE-14132
> URL: https://issues.apache.org/jira/browse/HBASE-14132
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14131) HBase Backup/Restore Phase 2: Describe backup image

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14131:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Describe backup image
> ---
>
> Key: HBASE-14131
> URL: https://issues.apache.org/jira/browse/HBASE-14131
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14131) HBase Backup/Restore Phase 2: Describe backup image

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14131:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Describe backup image
> ---
>
> Key: HBASE-14131
> URL: https://issues.apache.org/jira/browse/HBASE-14131
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14125) HBase Backup/Restore Phase 2: Cancel backup

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14125:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Cancel backup
> ---
>
> Key: HBASE-14125
> URL: https://issues.apache.org/jira/browse/HBASE-14125
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> Cancel backup operation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14125) HBase Backup/Restore Phase 2: Cancel backup

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14125:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Cancel backup
> ---
>
> Key: HBASE-14125
> URL: https://issues.apache.org/jira/browse/HBASE-14125
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> Cancel backup operation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14130) HBase Backup/Restore Phase 2: Delete backup image

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14130:
--
Labels: backup  (was: )

> HBase Backup/Restore Phase 2: Delete backup image
> -
>
> Key: HBASE-14130
> URL: https://issues.apache.org/jira/browse/HBASE-14130
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14130) HBase Backup/Restore Phase 2: Delete backup image

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14130:
--
Fix Version/s: 2.0.0

> HBase Backup/Restore Phase 2: Delete backup image
> -
>
> Key: HBASE-14130
> URL: https://issues.apache.org/jira/browse/HBASE-14130
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15443) HBase Backup Phase 2: Multiple backup destinations support (data model)

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15443:
--
Labels: backup  (was: )

> HBase Backup Phase 2: Multiple backup destinations support (data model)
> ---
>
> Key: HBASE-15443
> URL: https://issues.apache.org/jira/browse/HBASE-15443
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> The current implementation (implicitly) assumes a single backup destination: 
> when BackupLogCleaner decides whether a WAL file is eligible for deletion, we 
> do not check whether the WAL file has been copied over to ALL possible backup 
> destinations.
> This JIRA is to make the Phase 2 data structure/model/layout compatible with 
> this feature in the future. The feature itself is going to be part of Phase 3. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15443) HBase Backup Phase 2: Multiple backup destinations support (data model)

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15443:
--
Fix Version/s: 2.0.0

> HBase Backup Phase 2: Multiple backup destinations support (data model)
> ---
>
> Key: HBASE-15443
> URL: https://issues.apache.org/jira/browse/HBASE-15443
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> The current implementation (implicitly) assumes a single backup destination: 
> when BackupLogCleaner decides whether a WAL file is eligible for deletion, we 
> do not check whether the WAL file has been copied over to ALL possible backup 
> destinations.
> This JIRA is to make the Phase 2 data structure/model/layout compatible with 
> this feature in the future. The feature itself is going to be part of Phase 3. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14124) Failed backup is not handled properly in incremental mode

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14124:
--
Labels: backup  (was: backups)

> Failed backup is not handled properly in incremental mode
> -
>
> Key: HBASE-14124
> URL: https://issues.apache.org/jira/browse/HBASE-14124
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
>
> BackupHandler failedBackup method does not clean failed incremental backup 
> artefacts on HDFS (and in HBase).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14124) Failed backup is not handled properly in incremental mode

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14124:
--
Labels: backups  (was: )

> Failed backup is not handled properly in incremental mode
> -
>
> Key: HBASE-14124
> URL: https://issues.apache.org/jira/browse/HBASE-14124
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>  Labels: backups
> Fix For: 2.0.0
>
>
> BackupHandler failedBackup method does not clean failed incremental backup 
> artefacts on HDFS (and in HBase).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15442) HBase Backup Phase 2: Potential data loss and or data duplication in incremental backup

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15442:
--
Labels: backup  (was: )

> HBase Backup Phase 2: Potential data loss and or data duplication in 
> incremental backup
> ---
>
> Key: HBASE-15442
> URL: https://issues.apache.org/jira/browse/HBASE-15442
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: backup
> Fix For: 2.0.0
>
>
> Suppose we have two tables T1 and T2:
> # Create a full backup of T1 with backupId = B1
> # Create a full backup of T2 with backupId = B2
> # New data arrives in file WAL1
> # Create an incremental backup of T1 with backupId = B3
> # Create an incremental backup of T2 with backupId = B4
> The directory structure on the backup site after these steps:
> BACKUP_ROOT/WALs/B3
> BACKUP_ROOT/WALs/B4
> BACKUP_ROOT/T1/B1
> BACKUP_ROOT/T2/B2
> File WAL1 may end up either only in BACKUP_ROOT/WALs/B3, or in both 
> BACKUP_ROOT/WALs/B3 and BACKUP_ROOT/WALs/B4. Both are bad: in the first case 
> we lose data for backup B4, and in the second case we have duplicate copies 
> of WAL1.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-15442) HBase Backup Phase 2: Potential data loss and or data duplication in incremental backup

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-15442:
--
Fix Version/s: 2.0.0

> HBase Backup Phase 2: Potential data loss and or data duplication in 
> incremental backup
> ---
>
> Key: HBASE-15442
> URL: https://issues.apache.org/jira/browse/HBASE-15442
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
> Fix For: 2.0.0
>
>
> Suppose we have two tables T1 and T2:
> # Create a full backup of T1 with backupId = B1
> # Create a full backup of T2 with backupId = B2
> # New data arrives in file WAL1
> # Create an incremental backup of T1 with backupId = B3
> # Create an incremental backup of T2 with backupId = B4
> The directory structure on the backup site after these steps:
> BACKUP_ROOT/WALs/B3
> BACKUP_ROOT/WALs/B4
> BACKUP_ROOT/T1/B1
> BACKUP_ROOT/T2/B2
> File WAL1 may end up either only in BACKUP_ROOT/WALs/B3, or in both 
> BACKUP_ROOT/WALs/B3 and BACKUP_ROOT/WALs/B4. Both are bad: in the first case 
> we lose data for backup B4, and in the second case we have duplicate copies 
> of WAL1.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-14124) Failed backup is not handled properly in incremental mode

2017-03-23 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14124:
--
Fix Version/s: 2.0.0

> Failed backup is not handled properly in incremental mode
> -
>
> Key: HBASE-14124
> URL: https://issues.apache.org/jira/browse/HBASE-14124
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> BackupHandler failedBackup method does not clean failed incremental backup 
> artefacts on HDFS (and in HBase).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17595) Add partial result support for small/limited scan

2017-03-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938699#comment-15938699
 ] 

Hudson commented on HBASE-17595:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2725 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/2725/])
HBASE-17595 addendum fix the problem for mayHaveMoreCellsInRow (zhangduo: rev 
f1c1f258e5b2dee152a46bd7f6887e928e6a6b3e)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AllowPartialScanResultCache.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRawAsyncTableLimitedScanWithFilter.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/ColumnCountOnRowFilter.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScanResultCache.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/AbstractTestAsyncTableScan.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestLimitedScanWithFilter.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/CompleteScanResultCache.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScannerContext.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncScanSingleRegionRpcRetryingCaller.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/BatchScanResultCache.java


> Add partial result support for small/limited scan
> -
>
> Key: HBASE-17595
> URL: https://issues.apache.org/jira/browse/HBASE-17595
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client, scan
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17595-addendum.patch, 
> HBASE-17595-addendum-v1.patch, HBASE-17595-addendum-v2.patch, 
> HBASE-17595-addendum-v3.patch, HBASE-17595-branch-1-addendum.patch, 
> HBASE-17595-branch-1.patch, HBASE-17595.patch, HBASE-17595-v1.patch
>
>
> The partial result support was marked as a 'TODO' when implementing 
> HBASE-17045. And when implementing HBASE-17508, we found that if we make 
> small scan share the same logic with general scan, scan requests other than 
> the open-scanner request will not carry the small flag, so the server may 
> return partial results to the client and cause some strange behavior. It is 
> solved by modifying the logic at the server side, but this means the 1.4.x 
> client is not safe to use against earlier 1.x servers. So we'd better address 
> the problem at the client side. Marked as blocker as this issue should be 
> finished before any 2.x and 1.4.x releases.
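
Since this issue is about how the client consumes partial results on a small/limited scan, here is a minimal, hedged sketch of the client-side pattern. Scan#setLimit, Scan#setAllowPartialResults and Result#mayHaveMoreCellsInRow are the public APIs involved in this work (assuming an HBase 2.0-style client); the table name and connection setup are illustrative only.

{code:java}
// Minimal sketch, assuming an HBase 2.0-style client: consume a limited scan
// that may hand back partial Results when rows are wide.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class LimitedScanExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("t1"))) { // "t1" is illustrative
      Scan scan = new Scan()
          .setLimit(10)                   // limited scan: at most 10 rows
          .setAllowPartialResults(true);  // opt in to row fragments
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result result : scanner) {
          // With partial results allowed, one row may arrive as several
          // Results; mayHaveMoreCellsInRow() says whether more cells of the
          // current row are still to come.
          boolean rowComplete = !result.mayHaveMoreCellsInRow();
          System.out.println(Bytes.toStringBinary(result.getRow())
              + " complete=" + rowComplete);
        }
      }
    }
  }
}
{code}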



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17823) Migrate to Apache Yetus Audience Annotations

2017-03-23 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938668#comment-15938668
 ] 

Sean Busbey commented on HBASE-17823:
-

linking our recent changes to the annotations module to make sure those same 
things aren't needed over in yetus.

> Migrate to Apache Yetus Audience Annotations
> 
>
> Key: HBASE-17823
> URL: https://issues.apache.org/jira/browse/HBASE-17823
> Project: HBase
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 2.0.0
>
>
> Migrate from our own audience annotation handling to apache yetus' 
> implementation.
> [discussion thread on 
> dev@hbase|https://lists.apache.org/thread.html/5a83d37c9c763b3fc4114231489a073167ac69dbade9774af5ca4fb4@%3Cdev.hbase.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17823) Migrate to Apache Yetus Audience Annotations

2017-03-23 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-17823:
---

 Summary: Migrate to Apache Yetus Audience Annotations
 Key: HBASE-17823
 URL: https://issues.apache.org/jira/browse/HBASE-17823
 Project: HBase
  Issue Type: Improvement
  Components: API
Affects Versions: 2.0.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 2.0.0


Migrate from our own audience annotation handling to apache yetus' 
implementation.

[discussion thread on 
dev@hbase|https://lists.apache.org/thread.html/5a83d37c9c763b3fc4114231489a073167ac69dbade9774af5ca4fb4@%3Cdev.hbase.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-03-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938624#comment-15938624
 ] 

stack commented on HBASE-14614:
---

Made branch HBASE-14614. Put up a jenkins job to build on checkin: 
https://builds.apache.org/view/H-L/view/HBase/job/HBase-HBASE-14614/1/console

> Procedure v2: Core Assignment Manager
> -
>
> Key: HBASE-14614
> URL: https://issues.apache.org/jira/browse/HBASE-14614
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-14614.master.001.patch, 
> HBASE-14614.master.002.patch, HBASE-14614.master.003.patch, 
> HBASE-14614.master.004.patch, HBASE-14614.master.005.patch, 
> HBASE-14614.master.006.patch, HBASE-14614.master.007.patch, 
> HBASE-14614.master.008.patch, HBASE-14614.master.009.patch, 
> HBASE-14614.master.010.patch, HBASE-14614.master.011.patch, 
> HBASE-14614.master.012.patch, HBASE-14614.master.012.patch, 
> HBASE-14614.master.013.patch, HBASE-14614.master.014.patch, 
> HBASE-14614.master.015.patch, HBASE-14614.master.016.patch
>
>
> New AssignmentManager implemented using proc-v2.
>  - AssignProcedure handles the assign operation
>  - UnassignProcedure handles the unassign operation
>  - MoveRegionProcedure handles the move/balance operation
> Concurrent Assign operations are batched together and sent to the balancer.
> Concurrent Assign and Unassign operations ready to be sent to the RS are 
> batched together.
> This patch is an intermediate state where we add the new AM as 
> AssignmentManager2() to the master, to be reached by tests, but the new AM 
> will not be integrated with the rest of the system. Only the new AM unit 
> tests will exercise the new assignment manager. The integration with the 
> master code is part of HBASE-14616.
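
For readers new to proc-v2, below is a simplified, self-contained sketch of the state-machine pattern these procedures follow (execute one state, persist progress, move to the next). It deliberately does not extend the real StateMachineProcedure API; the state names and the Flow enum only mirror the idea and are assumptions for illustration.

{code:java}
// Simplified illustration of the proc-v2 state-machine pattern used by
// AssignProcedure and friends; this is NOT the real Procedure API.
public class AssignStateMachineSketch {
  enum State { PREPARE, SEND_OPEN_TO_RS, WAIT_OPENED, DONE }
  enum Flow { HAS_MORE_STATE, NO_MORE_STATE }

  private State state = State.PREPARE;

  /** Execute one step; a real procedure persists its state between steps. */
  Flow executeFromState(State current) {
    switch (current) {
      case PREPARE:
        // e.g. verify the region is offline and pick a target server
        state = State.SEND_OPEN_TO_RS;
        return Flow.HAS_MORE_STATE;
      case SEND_OPEN_TO_RS:
        // e.g. dispatch an OpenRegion request (batched with other assigns)
        state = State.WAIT_OPENED;
        return Flow.HAS_MORE_STATE;
      case WAIT_OPENED:
        // e.g. wake up once the RS reports the region as opened
        state = State.DONE;
        return Flow.NO_MORE_STATE;
      default:
        return Flow.NO_MORE_STATE;
    }
  }

  void run() {
    while (executeFromState(state) == Flow.HAS_MORE_STATE) {
      // the real executor would persist progress to the procedure store here
    }
  }
}
{code}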



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17529) MergeTableRegionsProcedure failed due to ArrayIndexOutOfBoundsException

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-17529:
--

Assignee: (was: Ted Yu)

> MergeTableRegionsProcedure failed due to ArrayIndexOutOfBoundsException
> ---
>
> Key: HBASE-17529
> URL: https://issues.apache.org/jira/browse/HBASE-17529
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>  Labels: rpc
> Attachments: 17529-master.log
>
>
> I built a tarball from the master branch based on commit 
> 616f4801b06a8427a03ceca9fb8345700ce1ad71.
> Was running the following command:
> hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList 
> -DinMemoryCompaction=BASIC Loop 4 6 100 /tmp/hbase-biglinkedlist-verify 6 
> --monkey slowDeterministic
> Here was related snippet:
> {code}
> 2017-01-24 21:29:00,107 DEBUG 
> [RpcServer.deafult.FPBQ.Fifo.handler=0,queue=0,port=16000] 
> procedure2.ProcedureExecutor: Stored MergeTableRegionsProcedure 
> (table=IntegrationTestBigLinkedList 
> regions=[IntegrationTestBigLinkedList,,1485292220242.4c5ea240e86ef22ec7264b1153dd557d.,
>  
> IntegrationTestBigLinkedList,\x0E8\xE3\x8E8\xE3\x8E8,1485292220242.6cdb98dfed41ea689b3cd66478c2c580.
>  ] forcible=false), procId=12, owner=hbase, 
> state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE
> 2017-01-24 21:29:00,108 DEBUG [ProcedureExecutorWorker-14] 
> wal.WALProcedureStore: Set running procedure count=1, slots=24
> 2017-01-24 21:29:00,127 ERROR [ProcedureExecutorWorker-14] 
> procedure2.ProcedureExecutor: CODE-BUG: Uncatched runtime exception for 
> procedure: MergeTableRegionsProcedure (table=IntegrationTestBigLinkedList 
> regions=[IntegrationTestBigLinkedList,,1485292220242.4c5ea240e86ef22ec7264b1153dd557d.,
>  
> IntegrationTestBigLinkedList,\x0E8\xE3\x8E8\xE3\x8E8,1485292220242.6cdb98dfed41ea689b3cd66478c2c580.
>  ] forcible=false), procId=12, owner=hbase, 
> state=RUNNABLE:MERGE_TABLE_REGIONS_MOVE_REGION_TO_SAME_RS
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:1024)
> at 
> org.apache.hadoop.hbase.nio.MultiByteBuff.get(MultiByteBuff.java:628)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$ByteBuffByteInput.read(RpcServer.java:1483)
> at 
> org.apache.hadoop.hbase.shaded.com.google.protobuf.ByteInputByteString.copyToInternal(ByteInputByteString.java:105)
> at 
> org.apache.hadoop.hbase.shaded.com.google.protobuf.ByteString.toByteArray(ByteString.java:651)
> at org.apache.hadoop.hbase.RegionLoad.getName(RegionLoad.java:50)
> at 
> org.apache.hadoop.hbase.ServerLoad.getRegionsLoad(ServerLoad.java:236)
> at 
> org.apache.hadoop.hbase.master.procedure.MergeTableRegionsProcedure.getRegionLoad(MergeTableRegionsProcedure.java:774)
> at 
> org.apache.hadoop.hbase.master.procedure.MergeTableRegionsProcedure.MoveRegionsToSameRS(MergeTableRegionsProcedure.java:461)
> at 
> org.apache.hadoop.hbase.master.procedure.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:142)
> at 
> org.apache.hadoop.hbase.master.procedure.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:72)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:154)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:708)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1332)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1133)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:76)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1588)
> {code}
> Master log to be attached.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17595) Add partial result support for small/limited scan

2017-03-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938614#comment-15938614
 ] 

Hudson commented on HBASE-17595:


FAILURE: Integrated in Jenkins build HBase-1.4 #679 (See 
[https://builds.apache.org/job/HBase-1.4/679/])
HBASE-17595 addendum fix the problem for mayHaveMoreCellsInRow (zhangduo: rev 
849ab5ff2998192d4f21d49f8356cc9a4370743a)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScanResultCache.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestLimitedScanWithFilter.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AllowPartialScanResultCache.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/BatchScanResultCache.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScannerContext.java
* (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/CompleteScanResultCache.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/ColumnCountOnRowFilter.java


> Add partial result support for small/limited scan
> -
>
> Key: HBASE-17595
> URL: https://issues.apache.org/jira/browse/HBASE-17595
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client, scan
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17595-addendum.patch, 
> HBASE-17595-addendum-v1.patch, HBASE-17595-addendum-v2.patch, 
> HBASE-17595-addendum-v3.patch, HBASE-17595-branch-1-addendum.patch, 
> HBASE-17595-branch-1.patch, HBASE-17595.patch, HBASE-17595-v1.patch
>
>
> The partial result support was marked as a 'TODO' when implementing 
> HBASE-17045. And when implementing HBASE-17508, we found that if we make 
> small scan share the same logic with general scan, scan requests other than 
> the open-scanner request will not carry the small flag, so the server may 
> return partial results to the client and cause some strange behavior. It is 
> solved by modifying the logic at the server side, but this means the 1.4.x 
> client is not safe to use against earlier 1.x servers. So we'd better address 
> the problem at the client side. Marked as blocker as this issue should be 
> finished before any 2.x and 1.4.x releases.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-14614) Procedure v2: Core Assignment Manager

2017-03-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938591#comment-15938591
 ] 

stack commented on HBASE-14614:
---

Let me make a branch for this dev. The patch is too big now and still not 
finished.

> Procedure v2: Core Assignment Manager
> -
>
> Key: HBASE-14614
> URL: https://issues.apache.org/jira/browse/HBASE-14614
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-14614.master.001.patch, 
> HBASE-14614.master.002.patch, HBASE-14614.master.003.patch, 
> HBASE-14614.master.004.patch, HBASE-14614.master.005.patch, 
> HBASE-14614.master.006.patch, HBASE-14614.master.007.patch, 
> HBASE-14614.master.008.patch, HBASE-14614.master.009.patch, 
> HBASE-14614.master.010.patch, HBASE-14614.master.011.patch, 
> HBASE-14614.master.012.patch, HBASE-14614.master.012.patch, 
> HBASE-14614.master.013.patch, HBASE-14614.master.014.patch, 
> HBASE-14614.master.015.patch, HBASE-14614.master.016.patch
>
>
> New AssignmentManager implemented using proc-v2.
>  - AssignProcedure handles the assign operation
>  - UnassignProcedure handles the unassign operation
>  - MoveRegionProcedure handles the move/balance operation
> Concurrent Assign operations are batched together and sent to the balancer.
> Concurrent Assign and Unassign operations ready to be sent to the RS are 
> batched together.
> This patch is an intermediate state where we add the new AM as 
> AssignmentManager2() to the master, to be reached by tests, but the new AM 
> will not be integrated with the rest of the system. Only the new AM unit 
> tests will exercise the new assignment manager. The integration with the 
> master code is part of HBASE-14616.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-03-23 Thread Kahlil Oppenheimer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938578#comment-15938578
 ] 

Kahlil Oppenheimer edited comment on HBASE-17707 at 3/23/17 3:34 PM:
-

bq. We cannot maintain two different cost functions for table skew. Let's 
remove the old one from the code, and only have the new implementation in this 
patch. We cannot have dead code lying around and rot. We can close HBASE-17706 
as won't fix.
I will add the removal of this old cost function to my patch.

bq. The new candidate generator TableSkewCandidateGenerator is not added to the 
SLB::candidateGenerators field which means that it is not used? I can only see 
the test using it. Is this intended? It has to be enabled by default.
Good catch on the table skew candidate generator. I will also add that to the 
patch as well. I was originally going to do it in a separate patch, but it 
makes much more sense to just do it here.

bq. Did you intend to use the raw variable here instead of calling scale again:
Yup! Let's call R the range [0, 1]. We know that scale() maps values into R. We 
also know that sqrt() maps values from R -> R. Lastly, we know that .9 * r + .1 
for any r in R yields another value in R, so we can be sure the outcome is in R. 
No need to call the scale function :).

Before opening the patch, I'm just repeatedly running the tests 100s of times 
to feel more confident I haven't missed edge cases since a lot of these test 
failures are very non-deterministic.


was (Author: kahliloppenheimer):
bq. We cannot maintain two different cost functions for table skew. Let's 
remove the old one from the code, and only have the new implementation in this 
patch. We cannot have dead code lying around and rot. We can close HBASE-17706 
as won't fix.
I will add the removal of this old cost function to my patch.

bq. The new candidate generator TableSkewCandidateGenerator is not added to the 
SLB::candidateGenerators field which means that it is not used? I can only see 
the test using it. Is this intended? It has to be enabled by default.
Good catch on the table skew candidate generator. I will also add that to the 
patch as well. I was originally going to do it in a separate patch, but it 
makes much more sense to just do it here.

bq. Did you intend to use the raw variable here instead of calling scale again:
Yup! Let's call R the range [0, 1]. We know that scale() maps values into R. We 
also know that sqrt() maps values from R -> R. Lastly, we know that .9 * r + .1 
for any r in R yields another value in R, so we can be sure the outcome is in R. 
No need to call the scale function :).



> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.

[jira] [Commented] (HBASE-17707) New More Accurate Table Skew cost function/generator

2017-03-23 Thread Kahlil Oppenheimer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938578#comment-15938578
 ] 

Kahlil Oppenheimer commented on HBASE-17707:


bq. We cannot maintain two different cost functions for table skew. Let's 
remove the old one from the code, and only have the new implementation in this 
patch. We cannot have dead code lying around and rot. We can close HBASE-17706 
as won't fix.
I will add the removal of this old cost function to my patch.

bq. The new candidate generator TableSkewCandidateGenerator is not added to the 
SLB::candidateGenerators field which means that it is not used? I can only see 
the test using it. Is this intended? It has to be enabled by default.
Good catch on the table skew candidate generator. I will also add that to the 
patch as well. I was originally going to do it in a separate patch, but it 
makes much more sense to just do it here.

bq. Did you intend to use the raw variable here instead of calling scale again:
Yup! Let's call R the range [0, 1]. We know that scale() maps values into R. We 
also know that sqrt() maps values from R -> R. Lastly, we know that .9 * r + .1 
for any r in R yields another value in R, so we can be sure the outcome is in R. 
No need to call the scale function :).
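
To make the range argument concrete, here is a tiny illustrative snippet of that final computation; the exact constants, ordering, and the scale() implementation in the actual patch may differ.

{code:java}
// Illustrative only: if scale() maps a raw skew value into [0, 1], then
// sqrt keeps it in [0, 1], and 0.9 * r + 0.1 keeps it in [0, 1] as well,
// so the result needs no second pass through scale().
static double normalizedTableSkew(double raw, double min, double max) {
  double scaled = (max == min) ? 0.0 : (raw - min) / (max - min); // scale() into [0, 1]
  double spread = Math.sqrt(scaled);  // still in [0, 1]
  return 0.9 * spread + 0.1;          // still in [0, 1]
}
{code}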



> New More Accurate Table Skew cost function/generator
> 
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>Priority: Minor
> Fix For: 2.0
>
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch, 
> HBASE-17707-02.patch, HBASE-17707-03.patch, HBASE-17707-04.patch, 
> HBASE-17707-05.patch, HBASE-17707-06.patch, HBASE-17707-07.patch, 
> HBASE-17707-08.patch, HBASE-17707-09.patch, HBASE-17707-11.patch, 
> HBASE-17707-11.patch, HBASE-17707-12.patch, test-balancer2-13617.out
>
>
> This patch includes a new version of the TableSkewCostFunction and a new 
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal 
> number of region moves required for a given table to perfectly balance the 
> table across the cluster (i.e. as if the regions from that table had been 
> round-robin-ed across the cluster). This number of moves is computed for each 
> table, then normalized to a score between 0-1 by dividing by the number of 
> moves required in the absolute worst case (i.e. the entire table is stored on 
> one server), and stored in an array. The cost function then takes a weighted 
> average of the average and maximum value across all tables. The weights in 
> this average are configurable to allow for certain users to more strongly 
> penalize situations where one table is skewed versus where every table is a 
> little bit skewed. To better spread this value more evenly across the range 
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize 
> the above TableSkewCostFunction. It first simply tries to move regions until 
> each server has the right number of regions, then it swaps regions around 
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with 
> 100s of TBs of data and 100s of tables across dozens of servers and found 
> both to be very performant and accurate.
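
Since the description packs the whole algorithm into prose, here is a compact, hedged sketch of the counting idea (regions per server for one table, a simple proxy for the minimal moves to a round-robin spread, normalized by the all-on-one-server worst case, then combined across tables). The method names, the excess-above-ceiling proxy, and the 0.5/0.5 weights are assumptions for illustration; the actual patch may compute the minimal moves and the weights differently.

{code:java}
// Illustrative computation of the per-table skew score described above.
static double tableSkewScore(int[] regionsPerServer) {
  int servers = regionsPerServer.length;
  int total = 0;
  for (int c : regionsPerServer) total += c;
  if (total == 0 || servers == 0) return 0.0;

  // Proxy for the minimal moves to reach a round-robin spread: every region
  // above the per-server ceiling has to move somewhere else.
  int ceil = (total + servers - 1) / servers;
  int moves = 0;
  for (int c : regionsPerServer) moves += Math.max(0, c - ceil);

  // Worst case: the entire table sits on one server.
  int worstCaseMoves = total - ceil;
  return worstCaseMoves == 0 ? 0.0 : (double) moves / worstCaseMoves;
}

// Combine per-table scores: weighted mix of mean and max, then sqrt to spread
// values more evenly across [0, 1]. The 0.5/0.5 weights stand in for the
// configurable weights mentioned in the description.
static double tableSkewCost(double[] perTableScores) {
  double sum = 0, max = 0;
  for (double s : perTableScores) { sum += s; max = Math.max(max, s); }
  double avg = perTableScores.length == 0 ? 0.0 : sum / perTableScores.length;
  return Math.sqrt(0.5 * avg + 0.5 * max);
}
{code}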



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Attachment: 17287.v2.txt

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
> at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938534#comment-15938534
 ] 

Ted Yu edited comment on HBASE-17287 at 3/23/17 3:30 PM:
-

In getLogDirs(), when we detect closed filesystem, abort the master.

Comment is welcome.
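
For readers following along, a hedged sketch of the shape of such a change: if the filesystem object turns out to be closed while collecting the log dirs, abort the master rather than letting it linger as a zombie. The helper names and the exists() probe are assumptions for illustration; only the Hadoop FileSystem calls and HBase's Abortable#abort are real APIs, and the actual patch may detect the condition differently.

{code:java}
// Sketch only: detect a dead FileSystem while listing WAL/log dirs and abort
// the master. The helper names and the cheap exists() probe are assumptions.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Abortable;

public class ClosedFsGuard {
  /** Returns true if the filesystem object appears to be closed. */
  static boolean isFileSystemClosed(FileSystem fs, Path probe) {
    try {
      fs.exists(probe); // any cheap call; a closed DFSClient throws here
      return false;
    } catch (IOException e) {
      String msg = e.getMessage();
      return msg != null && msg.contains("Filesystem closed");
    }
  }

  /** Abort the master if its filesystem is no longer usable. */
  static void checkFsOrAbort(FileSystem fs, Path walRoot, Abortable master) {
    if (isFileSystemClosed(fs, walRoot)) {
      master.abort("Filesystem closed while getting log dirs",
          new IOException("Filesystem closed"));
    }
  }
}
{code}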


was (Author: yuzhih...@gmail.com):
Tentative patch.

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
> at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Attachment: (was: 17287.v1.txt)

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.v2.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
> at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17816) HRegion#mutateRowWithLocks should update writeRequestCount metric

2017-03-23 Thread Weizhan Zeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938547#comment-15938547
 ] 

Weizhan Zeng commented on HBASE-17816:
--

{code:none}
  MultiRowMutationProcessor proc = new MultiRowMutationProcessor(mutations, rowsToLock);
  processRowsWithLocks(proc, -1, nonceGroup, nonce);
+ writeRequestsCount.add(mutations.size());
{code}
[~ashu210890] I think we should bump the write request count after the row processor has run, as above. What do you think?

> HRegion#mutateRowWithLocks should update writeRequestCount metric
> -
>
> Key: HBASE-17816
> URL: https://issues.apache.org/jira/browse/HBASE-17816
> Project: HBase
>  Issue Type: Bug
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-17816.master.001.patch
>
>
> Currently, all the calls that use HRegion#mutateRowWithLocks miss the 
> writeRequestCount metric. The mutateRowWithLocks base method should update 
> the metric.
> Examples are checkAndMutate calls through RSRpcServices#multi, 
> Region#mutateRow api , MultiRowMutationProcessor coprocessor endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17287:
---
Attachment: 17287.v1.txt

Tentative patch.

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
> Attachments: 17287.v1.txt
>
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
> at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17287) Master becomes a zombie if filesystem object closes

2017-03-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-17287:
--

Assignee: Ted Yu

> Master becomes a zombie if filesystem object closes
> ---
>
> Key: HBASE-17287
> URL: https://issues.apache.org/jira/browse/HBASE-17287
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Clay B.
>Assignee: Ted Yu
>
> We have seen an issue whereby if the HDFS is unstable and the HBase master's 
> HDFS client is unable to stabilize before 
> {{dfs.client.failover.max.attempts}} then the master's filesystem object 
> closes. This seems to result in an HBase master which will continue to run 
> (process and znode exists) but no meaningful work can be done (e.g. assigning 
> meta). What we saw in our HBase master logs was:
> {code}
> 2016-12-01 19:19:08,192 ERROR org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Caught M_META_SERVER_SHUTDOWN, count=1
> java.io.IOException: failed log splitting for cluster-r5n12.bloomberg.com,60200,1480632863218, will retry
> at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:84)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Filesystem closed
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17820) Fail build with hadoop-2.6.0

2017-03-23 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938461#comment-15938461
 ] 

Sean Busbey commented on HBASE-17820:
-

Excellent! I apologize in advance; handling licensing errors is still a bit 
rough.

First, you should switch to the master branch and see if building with Hadoop 
2.6.0 fails there as well. If it doesn't, then check branch-1, then branch-1.3, 
etc.

Next, to find what's causing the problem:

# Run the build, e.g. {{mvn clean install -Dhadoop-two.version=2.6.0 
-DskipTests}} (or {{assembly:single}} or whatever)
# Examine the output for the module that failed, e.g. hbase-assembly in the 
description of this jira
# Open the failed generated license file for that module ({{find 
%module%/target -name LICENSE}}, e.g. {{find hbase-assembly/target -name 
LICENSE}})
# The very end of that file should be the entry for the dependency that didn't 
have proper license information.

The next step will depend on what specific dependency is showing up. You should

* Make sure that in the maven log output there wasn't a warning about being 
unable to get a pom for the dependency named
* Make sure you haven't cached in your local maven repository an old / invalid 
pom for the dependency (the easiest way to do this is to delete the dependency 
from your local maven repository)

If the problem persists after that, you need to know what kind of error you 
have. If the license is not one of our whitelisted licenses, there'll be a note 
about "make sure this license is fine to make use of". Sometimes the license 
will just be missing.

 I can answer questions about the former issue once you have a particular 
license identified. If the license is missing, there's a heuristic for figuring 
out what it ought to be; I can walk through that if it turns out to be the 
problem.

> Fail build with hadoop-2.6.0
> 
>
> Key: HBASE-17820
> URL: https://issues.apache.org/jira/browse/HBASE-17820
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.2.4
> Environment: hadoop-2.6.0, java 8
>Reporter: Reid Chan
>Assignee: Reid Chan
>
> I used this command "mvn clean install -Dhadoop-two.version=2.6.0 
> -DskipTests" to build hbase-1.2.4 source code. 
> Build failed at hbase-assembly module.
> This is the fail message: 
> "Failed to execute goal 
> org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) 
> on project hbase-assembly: Error rendering velocity resource. Error invoking 
> method 'get(java.lang.Integer)' in java.util.ArrayList at 
> META-INF/LICENSE.vm[line 1671, column 8]: InvocationTargetException: Index: 
> 0, Size: 0 -> [Help 1]".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-17820) Fail build with hadoop-2.6.0

2017-03-23 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey reassigned HBASE-17820:
---

Assignee: Reid Chan

> Fail build with hadoop-2.6.0
> 
>
> Key: HBASE-17820
> URL: https://issues.apache.org/jira/browse/HBASE-17820
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.2.4
> Environment: hadoop-2.6.0, java 8
>Reporter: Reid Chan
>Assignee: Reid Chan
>
> I used this command "mvn clean install -Dhadoop-two.version=2.6.0 
> -DskipTests" to build hbase-1.2.4 source code. 
> Build failed at hbase-assembly module.
> This is the fail message: 
> "Failed to execute goal 
> org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) 
> on project hbase-assembly: Error rendering velocity resource. Error invoking 
> method 'get(java.lang.Integer)' in java.util.ArrayList at 
> META-INF/LICENSE.vm[line 1671, column 8]: InvocationTargetException: Index: 
> 0, Size: 0 -> [Help 1]".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

