[jira] [Commented] (HBASE-20411) Ameliorate MutableSegment synchronize

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548802#comment-16548802
 ] 

stack commented on HBASE-20411:
---

MemStoreSize is 
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC) By rights, I 
should not have changed the visibility. Want me to restore [~carp84]?



> Ameliorate MutableSegment synchronize
> -
>
> Key: HBASE-20411
> URL: https://issues.apache.org/jira/browse/HBASE-20411
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: huaxiang sun
>Priority: Major
> Fix For: 2.0.1
>
> Attachments: 2.load.patched.17704.lock.svg, 
> 2.load.patched.2.17704.lock.svg, 2.more.patch.12010.lock.svg, 
> 2.pe.write.32026.lock.svg, 2.pe.write.ameliorate.106553.lock.svg, 
> 41901.lock.svg, HBASE-20411-atomiclong-longadder.patch, 
> HBASE-20411.branch-2.0.001.patch, HBASE-20411.branch-2.0.002.patch, 
> HBASE-20411.branch-2.0.003.patch, HBASE-20411.branch-2.0.004.patch, 
> HBASE-20411.branch-2.0.005.patch, HBASE-20411.branch-2.0.006.patch, 
> HBASE-20411.branch-2.0.007.patch, HBASE-20411.branch-2.0.008.patch, 
> HBASE-20411.branch-2.0.009.patch, HBASE-20411.branch-2.0.010.patch, 
> HBASE-20411.branch-2.0.011.patch, HBASE-20411.branch-2.0.012.patch, 
> HBASE-20411.branch-2.0.013.patch, compatibility-matrix.png, 
> jmc6.write_time_locks.png
>
>
> This item is migrated from HBASE-20236 so it gets dedicated issue.
> Let me upload evidence that has this synchronize as a stake in our write-time 
> perf. I'll migrate the patch I posted with updates that come of comments 
> posted by [~mdrob] on the HBASE-20236 issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548794#comment-16548794
 ] 

stack commented on HBASE-20823:
---

[~reidchan] Looks like it was done. Shout if you need anything else sir.

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Trivial
>  Labels: beginner
> Attachments: HBASE-20823.001.patch
>
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548792#comment-16548792
 ] 

stack commented on HBASE-20846:
---

On the patch...

Non-change in AbstractProcedureScheduler.java ?

Most of patch content is swapping Procedure for Procedure.

Trying to understand... now Procedure has a *locked* data member that can be 
used by subprocedures. Before subprocedures would supply their own boolean... 
sometimes it was named lock... other times locked. A subprocedure that wants a 
lock for the life of the procedure returns holdLock as per before? If a 
procedure wants to be locked for the procedure step, they do same as before 
too? They just call up to the super class to manage the setting an unsetting of 
locked?

This lockedWhenLoading is not just for procedures that holdLock for life of the 
Procedure run it seems. We'll restore the lock even if we crashed mid-procedure 
step on load? That seems right/expected.

920   // persist that we have held the lock. This must be done before 
we actually execute the
921   // procedure, otherwise when restarting, we may consider the 
procedure does not have a lock,
922   // but it may have already done some changes as we have already 
executed it, and if another
923   // procedure gets the lock, then the semantic will be broken if 
the holdLock is true, as we do
924   // not expect that another procedure can be executed in the 
middle.
925   store.update(this);

Now we do double-store to ProcedureWALStore? Once for the lock and then once 
for end-state of the Procedure step? Each time? Slows down pv2 throughput? Any 
chance of  a reordering so we don't release lock until after we've persisted 
step and lock state?

Is this right?

190* The {@link #doAcquireLock(Object, ProcedureStore)} will be split 
into two steps, first, it will
191* call us to determine whether we need to wait for initialization, 
second, it will call
192* {@link #doAcquireLock(Object, ProcedureStore)} to actually handle 
the lock for this procedure.

Seems to say doAcquireLock calls itself.


You remove @InterfaceAudience.Private on methods because class is 
@InterfaceAudience.Private? Could remove the Evolving from the class too... 
Evolving and Private don't make sense together. Nit.

Nice translation: 94return 
subprocs.stream().mapToLong(Procedure::getProcId).toArray();

These will become annoying?

  LOG.debug("{} didn't hold the lock before restarting, skip acquiring 
lock.", this);

You probably need them for the moment debugging though? Could leave them in and 
then strip them later when known working?

We release the lock even if we didn't have it? And maybe update store if 
though we didn't have lock.

933   final void doReleaseLock(TEnvironment env, ProcedureStore store) {
934 locked = false;

Will be back with more. Its taking time to digest. Thanks.

This will change how we operate post-master crash but it should make stuff way 
better, easier to stand at least, and more predictable.



> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive 

[jira] [Updated] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Jack Bearden (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Bearden updated HBASE-20823:
-
Attachment: HBASE-20823.001.patch
Status: Patch Available  (was: Open)

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Trivial
>  Labels: beginner
> Attachments: HBASE-20823.001.patch
>
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548787#comment-16548787
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
36s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} hbase-procedure: The patch generated 0 new + 28 
unchanged - 16 fixed = 28 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
18s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
53s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}174m 20s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}237m 

[jira] [Assigned] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-20823:
--

Assignee: Jack Bearden

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Trivial
>  Labels: beginner
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20565) ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result since 1.4

2018-07-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548776#comment-16548776
 ] 

Zheng Hu commented on HBASE-20565:
--

[~reidchan]  This patch do not fix the problem in HBASE-20151... it's seems 
another diff problem.  Will dig HBASE-20151 later .. 

> ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect 
> result since 1.4
> -
>
> Key: HBASE-20565
> URL: https://issues.apache.org/jira/browse/HBASE-20565
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 2.1.0, 1.4.4, 2.0.0
>Reporter: Jerry He
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-20565.v1.patch, HBASE-20565.v2.patch, debug.diff, 
> debug.log, test-branch-1.4.patch
>
>
> When ColumnPaginationFilter is combined with ColumnRangeFilter, we may see 
> incorrect result.
> Here is a simple example.
> One row with 10 columns c0, c1, c2, .., c9.  I have a ColumnRangeFilter for 
> range c2 to c9.  Then I have a ColumnPaginationFilter with limit 5 and offset 
> 0.  FileterList is FilterList(Operator.MUST_PASS_ALL, ColumnRangeFilter, 
> ColumnPaginationFilter).
> We expect 5 columns being returned.  But in HBase 1.4 and after, 4 columns 
> are returned.
> In 1.2.x, the correct 5 columns are returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Yu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-20907:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed into master, branch-2, branch-2.0 and branch-2.1. Thanks 
[~yuzhih...@gmail.com] for review.

> Fix Intermittent failure on TestProcedurePriority
> -
>
> Key: HBASE-20907
> URL: https://issues.apache.org/jira/browse/HBASE-20907
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20907.patch, HBASE-20907.v2.patch
>
>
> From a local UT check against 2.1.0-RC1, HMaster failed to initialize before 
> time out. Checking the test log we could see below message:
> {noformat}
> 2018-07-17 20:06:37,142 DEBUG [Thread-4003] 
> client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=6, 
> started=4173 ms ago, cancelled=false, msg=java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preGet(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2520)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2460)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> , details=row 'hbase:namespace' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, 
> hostname=hdpdevm1.et2sqa.tbsite.net,59254,1531829189215, seqNum=-1, 
> exception=java.io.IOException: java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> ...
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:403)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:94)
> {noformat}
> In current test code we will set {{FAIL}} to true w/o checking whether 
> namespace manager is already up, and if not lucky we will run into the above 
> case and get a timeout.
> The fix will be quite straight forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Jack Bearden (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548772#comment-16548772
 ] 

Jack Bearden commented on HBASE-20823:
--

Ok, thanks [~reidchan]. I sent a message [to 
d...@hbase.apache.org|mailto:to%c2%a0...@hbase.apache.org]. I'll upload the 
patch as soon as I get elevated

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Priority: Trivial
>  Labels: beginner
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548762#comment-16548762
 ] 

Jingyun Tian commented on HBASE-20855:
--

[~yuzhih...@gmail.com] Thx, Ted. These problems are addressed.

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch, HBASE-20855.branch-1.007.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-20855:
-
Attachment: HBASE-20855.branch-1.007.patch

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch, HBASE-20855.branch-1.007.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548758#comment-16548758
 ] 

Reid Chan commented on HBASE-20823:
---

ping [~stack], can you help grant [~jackbearden] contributor.

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Priority: Trivial
>  Labels: beginner
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548757#comment-16548757
 ] 

Hadoop QA commented on HBASE-20401:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
42s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 13s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}143m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}187m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20401 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932161/HBASE-20401.master.004.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 6b879fd5b882 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13680/testReport/ |
| Max. process+thread count | 4074 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13680/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Make `MAX_WAIT` and 

[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-18 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20875:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for reviews. Pushed ot branch-2.0+.

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -
>
> Key: HBASE-20875
> URL: https://issues.apache.org/jira/browse/HBASE-20875
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.1
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: 
> 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 
> 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, 
> HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 
> 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
>  ./hbase/bin/hbase  --config ~/conf_hbase 
> org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40  
> --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map
> [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
>  10.42%  libjvm.so
>  [.] 
> ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*,
>  oopDesc*, unsigned long, markOopDesc*)
>   6.78%  perf-91935.map   
>  [.] 
> Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> 
> {code}
> These are top CPU consumers using perf-map-agent ./bin/perf-java-top... 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing

2018-07-18 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20875:
--
Fix Version/s: 2.1.1
   2.2.0
   3.0.0

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -
>
> Key: HBASE-20875
> URL: https://issues.apache.org/jira/browse/HBASE-20875
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Affects Versions: 2.0.1
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: 
> 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 
> 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, 
> HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 
> 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
>  ./hbase/bin/hbase  --config ~/conf_hbase 
> org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40  
> --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map
> [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
>  10.42%  libjvm.so
>  [.] 
> ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*,
>  oopDesc*, unsigned long, markOopDesc*)
>   6.78%  perf-91935.map   
>  [.] 
> Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> 
> {code}
> These are top CPU consumers using perf-map-agent ./bin/perf-java-top... 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548729#comment-16548729
 ] 

Mike Drob commented on HBASE-20894:
---

There are specific security issues with Java Object Serialization that are not 
present in protobuf, thrift, kryo, avro, etc... There may be other issues 
present in those libraries, so details matter. The link I provided has specific 
examples for how ObjectInputStream can behave badly.

I also do not think this is a significant security issue - if it were, then it 
would have been handled as a private jira with an associated CVE and following 
the standard ASF security practices.

Security auditors like to run scans and check checkboxes. This is something 
that can come up in one of those scans. As the code exists right now, setting 
file system permissions is completely sufficient, but security issues have a 
nasty way of coming up exactly when you aren't looking for them and I'd be hard 
pressed to prove that this couldn't be used as a part of some larger privilege 
escalation in the future. So, we can "fix" it by using a different ser/de 
technique.

I'm still having trouble discerning the severity of your stance on this issue. 
Would you be a "-1 don't do this, it's actively harmful" or "-0 I think it's a 
waste of time but I'm not going to stop you" if you had to commit to a 
position? Or something else entirely?

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548725#comment-16548725
 ] 

Hadoop QA commented on HBASE-20460:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  5m  
1s{color} | {color:blue} branch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 
54s{color} | {color:blue} patch has no errors when building the reference 
guide. See footer for rendered docs, which you should manually inspect. {color} 
|
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}253m 
11s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
55s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}280m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20460 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932148/HBASE-20460.001.patch 
|
| Optional Tests |  asflicense  javac  javadoc  unit  refguide  xml  |
| uname | Linux f63fc23c3481 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13678/artifact/patchprocess/branch-site/book.html
 |
| refguide | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13678/artifact/patchprocess/patch-site/book.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13678/testReport/ |
| Max. process+thread count | 4944 (vs. ulimit of 1) |
| modules | C: hbase-common . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13678/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HBASE-20460.001.patch
>
>
> We have an empty section in refguide that needs 

[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548723#comment-16548723
 ] 

Ted Yu commented on HBASE-20855:


{code}
+  public PeerConfigTracker getPeerConfigTracker() {

+public Set getListeners(){
{code}
These methods can be package private.
{code}
+public synchronized void removeListener(ReplicationPeerConfigListener 
removeListener) {
{code}
removeListener -> listenerToRemove

After the above is addressed, should be good to go.



> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export

2018-07-18 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548722#comment-16548722
 ] 

Mingliang Liu commented on HBASE-20873:
---

+1 (non-binding). {{Export::checkDir()}} does check the existence of output 
directory. 

{quote}Could we highlight it by the [NOTE] block?{quote}
You mean {{NOTE:}}. Nice.

> Update doc for Endpoint-based Export
> 
>
> Key: HBASE-20873
> URL: https://issues.apache.org/jira/browse/HBASE-20873
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Attachments: HBASE-20873.master.001.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a 
> stab at expanding it, based on my experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20876) Improve docs style in HConstants

2018-07-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548721#comment-16548721
 ] 

Mike Drob commented on HBASE-20876:
---

[~jojochuang] - You can also use the dev-support/submit-patch helper script!

> Improve docs style in HConstants
> 
>
> Key: HBASE-20876
> URL: https://issues.apache.org/jira/browse/HBASE-20876
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: beginner, beginners, newbie
> Fix For: 3.0.0
>
> Attachments: HBASE-20876.master.001.patch
>
>
> In {{HConstants}}, there's a docs snippet:
> {code}
>  /** Don't use it! This'll get you the wrong path in a secure cluster.
>   * Use FileSystem.getHomeDirectory() or
>   * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()  */
> {code}
> It's ugly style.
> Let's improve this docs with following
> {code}
> /**
>  * Description
>  */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548707#comment-16548707
 ] 

Jingyun Tian commented on HBASE-20855:
--

[~yuzhih...@gmail.com] Failed tests are flaky ones. They can all passed locally.

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20565) ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result since 1.4

2018-07-18 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548702#comment-16548702
 ] 

Reid Chan commented on HBASE-20565:
---

Can you try your patch with the described test cases in HBASE-20151, just 
wondering since both issues happen on FilterAnd.
It would be good if this patch is general enough to solve both.

> ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect 
> result since 1.4
> -
>
> Key: HBASE-20565
> URL: https://issues.apache.org/jira/browse/HBASE-20565
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 2.1.0, 1.4.4, 2.0.0
>Reporter: Jerry He
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-20565.v1.patch, HBASE-20565.v2.patch, debug.diff, 
> debug.log, test-branch-1.4.patch
>
>
> When ColumnPaginationFilter is combined with ColumnRangeFilter, we may see 
> incorrect result.
> Here is a simple example.
> One row with 10 columns c0, c1, c2, .., c9.  I have a ColumnRangeFilter for 
> range c2 to c9.  Then I have a ColumnPaginationFilter with limit 5 and offset 
> 0.  FileterList is FilterList(Operator.MUST_PASS_ALL, ColumnRangeFilter, 
> ColumnPaginationFilter).
> We expect 5 columns being returned.  But in HBase 1.4 and after, 4 columns 
> are returned.
> In 1.2.x, the correct 5 columns are returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548695#comment-16548695
 ] 

Reid Chan commented on HBASE-20401:
---

LGTM.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch, HBASE-20401.master.004.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-18 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548693#comment-16548693
 ] 

Mingliang Liu commented on HBASE-20856:
---

In test, {{conf}} is static, which makes the newly added tests problematic as 
they seem to assume clean conf object.

Another suggestion is to use different providers in {{testOnlySetWALProvider}} 
and {{testOnlySetMetaWALProvider}}.

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Affects Versions: 3.0.0
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20901) Reducing region replica has no effect

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548689#comment-16548689
 ] 

Hadoop QA commented on HBASE-20901:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 8s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
58s{color} | {color:red} hbase-server: The patch generated 25 new + 52 
unchanged - 28 fixed = 77 total (was 80) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 8s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 32s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
9s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}195m 
18s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}242m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932121/HBASE-20901_v1.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e895012eaaed 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 

[jira] [Commented] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Reid Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548684#comment-16548684
 ] 

Reid Chan commented on HBASE-20823:
---

[~jackbearden] Please feel free to take it.

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Priority: Trivial
>  Labels: beginner
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548681#comment-16548681
 ] 

Hadoop QA commented on HBASE-20401:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 5s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}195m 57s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}233m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.TestSyncReplicationStandbyKillMaster |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20401 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932154/HBASE-20401.master.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 89f5595b7a3f 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13677/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13677/testReport/ |
| Max. process+thread count | 4827 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |

[jira] [Commented] (HBASE-20565) ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result since 1.4

2018-07-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548680#comment-16548680
 ] 

Zheng Hu commented on HBASE-20565:
--

Any other concerns ? [~yuzhih...@gmail.com], [~jinghe], [~anoop.hbase] Thanks. 

> ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect 
> result since 1.4
> -
>
> Key: HBASE-20565
> URL: https://issues.apache.org/jira/browse/HBASE-20565
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 2.1.0, 1.4.4, 2.0.0
>Reporter: Jerry He
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-20565.v1.patch, HBASE-20565.v2.patch, debug.diff, 
> debug.log, test-branch-1.4.patch
>
>
> When ColumnPaginationFilter is combined with ColumnRangeFilter, we may see 
> incorrect result.
> Here is a simple example.
> One row with 10 columns c0, c1, c2, .., c9.  I have a ColumnRangeFilter for 
> range c2 to c9.  Then I have a ColumnPaginationFilter with limit 5 and offset 
> 0.  FileterList is FilterList(Operator.MUST_PASS_ALL, ColumnRangeFilter, 
> ColumnPaginationFilter).
> We expect 5 columns being returned.  But in HBase 1.4 and after, 4 columns 
> are returned.
> In 1.2.x, the correct 5 columns are returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20565) ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result since 1.4

2018-07-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548679#comment-16548679
 ] 

Zheng Hu commented on HBASE-20565:
--

bq. If the order of filters in the FilterList changes, e.g. 
It's the expected behavior.  as I comment above.
bq. Does this mean that the hint is not utilized with the fix ? 
This is because cell pass to filter4 will return NEXT_COL, so no need to pass 
this cell to filter5 & filter6, even though filter5/filter6 may jump to a 
farther place...


> ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect 
> result since 1.4
> -
>
> Key: HBASE-20565
> URL: https://issues.apache.org/jira/browse/HBASE-20565
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 2.1.0, 1.4.4, 2.0.0
>Reporter: Jerry He
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: HBASE-20565.v1.patch, HBASE-20565.v2.patch, debug.diff, 
> debug.log, test-branch-1.4.patch
>
>
> When ColumnPaginationFilter is combined with ColumnRangeFilter, we may see 
> incorrect result.
> Here is a simple example.
> One row with 10 columns c0, c1, c2, .., c9.  I have a ColumnRangeFilter for 
> range c2 to c9.  Then I have a ColumnPaginationFilter with limit 5 and offset 
> 0.  FileterList is FilterList(Operator.MUST_PASS_ALL, ColumnRangeFilter, 
> ColumnPaginationFilter).
> We expect 5 columns being returned.  But in HBase 1.4 and after, 4 columns 
> are returned.
> In 1.2.x, the correct 5 columns are returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18659) Use HDFS ACL to give user the ability to read snapshot directly on HDFS

2018-07-18 Thread Yi Mei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548678#comment-16548678
 ] 

Yi Mei commented on HBASE-18659:


Upload a design doc: 
https://docs.google.com/document/d/1D2iAdbrW5CcKc2SthJBXA1n2tTMTftuVaFtxbOWFuqM/edit?usp=sharing
Comments are appreciative, thanks.

> Use HDFS ACL to give user the ability to read snapshot directly on HDFS
> ---
>
> Key: HBASE-18659
> URL: https://issues.apache.org/jira/browse/HBASE-18659
> Project: HBase
>  Issue Type: New Feature
>Reporter: Duo Zhang
>Assignee: Yi Mei
>Priority: Major
> Attachments: HBASE-18659.master.001.patch, 
> HBASE-18659.master.002.patch, HBASE-18659.master.003.patch, 
> HBASE-18659.master.004.patch
>
>
> On the dev meetup notes in Shenzhen after HBaseCon Asia, there is a topic 
> about the permission to read hfiles on HDFS directly.
> {quote}
> For client-side scanner going against hfiles directly; is there a means of 
> being able to pass the permissions from hbase to hdfs?
> {quote}
> And at Xiaomi we also face the same problem. {{SnapshotScanner}} is much 
> faster and consumes less resources, but only super use has the ability to 
> read hfile directly on HDFS.
> So here we want to use HDFS ACL to address this problem.
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_File_System_API
> The basic idea is to set acl and default acl on the ns/table/cf directory on 
> HDFS for the users who have the permission to read the table on HBase.
> Suggestions are welcomed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20905) branch-1 docker build fails

2018-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548661#comment-16548661
 ] 

Hudson commented on HBASE-20905:


Results for branch branch-1.4
[build #389 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/389/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/389//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/389//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/389//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> branch-1 docker build fails
> ---
>
> Key: HBASE-20905
> URL: https://issues.apache.org/jira/browse/HBASE-20905
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.5.0
>Reporter: Jingyun Tian
>Assignee: Mike Drob
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20905.branch-1.001.patch
>
>
> Docker build for precommit fails:
> {quote}
> 19:08:29 Cleaning up...19:08:29 Command python setup.py egg_info failed with 
> error code 1 in /tmp/pip_build_root/pylint*19:08:29* Storing debug log for 
> failure in /root/.pip/pip.log*19:08:29* The command '/bin/sh -c pip install 
> pylint' returned a non-zero code: 1*19:08:29* 19:08:29 Total Elapsed time: 0m 
> 3s*19:08:29* 19:08:29 ERROR: Docker failed to build image.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak Lon (Stephen) Wu updated HBASE-20856:
-
Fix Version/s: 3.0.0
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Affects Versions: 3.0.0
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548637#comment-16548637
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20856:
--

Thanks [~stack], uploaded patch and linked the 
[PR#83|https://github.com/apache/hbase/pull/83] for review.

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak Lon (Stephen) Wu updated HBASE-20856:
-
Attachment: HBASE-20856.master.001.patch

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20908) Infinite loop on regionserver if region replica are reduced

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548634#comment-16548634
 ] 

Hadoop QA commented on HBASE-20908:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} hbase-server: The patch generated 29 new + 47 
unchanged - 28 fixed = 76 total (was 75) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 59s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
59s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}117m 
16s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20908 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932132/HBASE-20908.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux a1e8fe187f7c 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 

[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548609#comment-16548609
 ] 

Ted Yu commented on HBASE-20723:


HBASE-17437 was not in 1.2 release.

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 2.1.0, 1.5.0, 1.4.6
>
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>
> Description:
> When custom hbase.wal.dir is configured the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes a loss of writes 
> that had not flushed prior to loss of a server.
>  
> Reproduction:
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
>  
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20823) Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest

2018-07-18 Thread Jack Bearden (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548597#comment-16548597
 ] 

Jack Bearden commented on HBASE-20823:
--

Hey [~reidchan]! I wanted to put in a patch for this one and a few other HBase 
tickets I found. I am a Hadoop contributor but not yet a contributor for HBase

> Wrong param name in javadoc for HRegionServer#buildRegionSpaceUseReportRequest
> --
>
> Key: HBASE-20823
> URL: https://issues.apache.org/jira/browse/HBASE-20823
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Priority: Trivial
>  Labels: beginner
>
> {code}
> /**
>* Builds a {@link RegionSpaceUseReportRequest} protobuf message from the 
> region size map.
>*
>* @param regionSizeStore The size in bytes of regions
>* @return The corresponding protocol buffer message.
>*/
>   RegionSpaceUseReportRequest 
> buildRegionSpaceUseReportRequest(RegionSizeStore regionSizes)
> {code}
> {{@param regionSizeStore}}, but actual name is regionSizes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in data loss because we write recovered edits into a different place than where the recovering region server looks for them

2018-07-18 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548560#comment-16548560
 ] 

Sakthi commented on HBASE-20723:


[~yuzhih...@gmail.com] [~mdrob] [~busbey], I see that this hasn't been 
backported to 1.2 yet? Any specific reasons?

> Custom hbase.wal.dir results in data loss because we write recovered edits 
> into a different place than where the recovering region server looks for them
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 2.1.0, 1.5.0, 1.4.6
>
> Attachments: 20723.branch-1.txt, 20723.branch-2.txt, 20723.v1.txt, 
> 20723.v10.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 20723.v5.txt, 
> 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 20723.v9.txt, logs.zip
>
>
> Description:
> When custom hbase.wal.dir is configured the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes a loss of writes 
> that had not flushed prior to loss of a server.
>  
> Reproduction:
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
>  
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548553#comment-16548553
 ] 

stack commented on HBASE-20853:
---

LGTM. Waiting on [~chia7712] +1.

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch, HBASE-20853.master.003.patch, 
> HBASE-20853.master.004.patch, HBASE-20853.master.005.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts

2018-07-18 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20846:
--
Attachment: HBASE-20846-v4.patch

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548550#comment-16548550
 ] 

Vladimir Rodionov commented on HBASE-20894:
---

I do not see how protobuf helps here with  security and, personally, do not 
consider this as a significant security issue for HBase. This can be easily 
prevented, by setting right permissions on a file system access, at least in 
case of serialized data stored in file system. Again, imho.

If you are looking for perfectly secured java serde library, I do not think it 
exists, [~mdrob], otherwise there are a plenty out there, starting with Kryo, 
but I do not think that HBase needs new dependency  only for BucketCache ser/de 
code. 

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20867) RS may get killed while master restarts

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548548#comment-16548548
 ] 

stack commented on HBASE-20867:
---

bq. I think we should deal with those kinds of connection exceptions in 
RSProcedureDispatcher and retry the rpc call

The Master is aborting? Why then retry? Or is it that the Master is aborting 
and the retry may or may not happen... better retry than have the RS do an 
abort?

Should extend HBaseIOException: 29  public class ConnectionClosedException 
extends IOException { See 
https://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html

We have to do this?

180 } else if (exception instanceof ConnectionClosedException) {
181   return (ConnectionClosedException) new ConnectionClosedException(
182   "Call to " + addr + " failed because " + 
exception).initCause(exception);

Can we not get above info from ChannelHandlerContext where we throw the 
exceptions?

Otherwise, nice patch.


> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch
>
>
> If the master is dispatching a RPC call to RS when aborting. A connection 
> exception may be thrown by the RPC layer(A IOException with "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard is as an 
> un-retryable exception and pass it to UnassignProcedue.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is very healthy, only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the rpc call



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20867) RS may get killed while master restarts

2018-07-18 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20867:
--
Fix Version/s: 2.1.1
   2.0.2
   3.0.0

> RS may get killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch
>
>
> If the master is dispatching a RPC call to RS when aborting. A connection 
> exception may be thrown by the RPC layer(A IOException with "Connection 
> closed" message in this case). The RSProcedureDispatcher will regard is as an 
> un-retryable exception and pass it to UnassignProcedue.remoteCallFailed, 
> which will expire the RS.
> Actually, the RS is very healthy, only the master is restarting.
> I think we should deal with those kinds of connection exceptions in 
> RSProcedureDispatcher and retry the rpc call



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548509#comment-16548509
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20401:
--

[~yuzhih...@gmail.com] I modified the elapsed duration calculation in both 
classes and updated, thanks.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch, HBASE-20401.master.004.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak Lon (Stephen) Wu updated HBASE-20401:
-
Attachment: HBASE-20401.master.004.patch

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch, HBASE-20401.master.004.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548503#comment-16548503
 ] 

stack commented on HBASE-20460:
---

Hey [~anoop.hbase]

Missing from the paragraph is note on what an operator should look for to see 
if they have this feature enabled and working properly as well as what metrics 
they can look at it to see effectiveness.

Are there recommendations for good sizings? Do I have to set direct memory 
sizing?

Example configs and logs would help.

Should there be a note on how this relates to IMC?

Thanks.



> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HBASE-20460.001.patch
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20886) [Auth] Support keytab login in hbase client

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548458#comment-16548458
 ] 

Hadoop QA commented on HBASE-20886:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
17s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} hbase-common: The patch generated 0 new + 7 
unchanged - 1 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} The patch hbase-client passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} The patch hbase-server passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
12s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
31s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 49s{color} 
| {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}199m 33s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 0s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}256m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncProcess |
|   | hadoop.hbase.client.TestConnectionImplementation |
|   | hadoop.hbase.TestMetaTableAccessorNoCluster |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b 

[jira] [Commented] (HBASE-20876) Improve docs style in HConstants

2018-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548457#comment-16548457
 ] 

Hudson commented on HBASE-20876:


Results for branch master
[build #400 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/400/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/400//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/400//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/400//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/400//console].


> Improve docs style in HConstants
> 
>
> Key: HBASE-20876
> URL: https://issues.apache.org/jira/browse/HBASE-20876
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: beginner, beginners, newbie
> Fix For: 3.0.0
>
> Attachments: HBASE-20876.master.001.patch
>
>
> In {{HConstants}}, there's a docs snippet:
> {code}
>  /** Don't use it! This'll get you the wrong path in a secure cluster.
>   * Use FileSystem.getHomeDirectory() or
>   * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()  */
> {code}
> It's ugly style.
> Let's improve this docs with following
> {code}
> /**
>  * Description
>  */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Attachment: HBASE-20895-branch-1.patch

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548443#comment-16548443
 ] 

Andrew Purtell commented on HBASE-20895:


Here's a patch that doesn't break tests.

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548429#comment-16548429
 ] 

Tak Lon (Stephen) Wu edited comment on HBASE-20401 at 7/18/18 9:33 PM:
---

[~reidchan] I have done updated it with new property keys in {{HFileCleaner}} 
as well (also in the [PR#82|https://github.com/apache/hbase/pull/82]) and 
thanks for letting me know about only provide master branch for review, I'm 
wondered if {{HFileCleaner}} and {{LogCleaner}} can treat the same such that we 
have only introduce two property keys in {{CleanerChore}} only. let me know 
what you think.


was (Author: taklwu):
[~reidchan] I have done updated it with new property keys in {{HFileCleaner}} 
as well (also in the [PR#82|https://github.com/apache/hbase/pull/82]) and 
thanks for letting me know about only provide master branch for review, I'm 
wondered if {{HFileCleaner}} and {{LogCleaner can treat the same such that we 
have only introduce two property keys in CleanerChore}} only. let me know what 
you think.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548429#comment-16548429
 ] 

Tak Lon (Stephen) Wu edited comment on HBASE-20401 at 7/18/18 9:31 PM:
---

[~reidchan] I have done updated it with new property keys in {{HFileCleaner}} 
as well (also in the [PR#82|https://github.com/apache/hbase/pull/82]) and 
thanks for letting me know about only provide master branch for review, I'm 
wondered if {{HFileCleaner}} and {{LogCleaner can treat the same such that we 
have only introduce two property keys in CleanerChore}} only. let me know what 
you think.


was (Author: taklwu):
[~reidchan] I have done updated it with new property keys in {{{HFileCleaner}}} 
as well (also in the [PR#82|https://github.com/apache/hbase/pull/82]) and 
thanks for letting me know about only provide master branch for review, I'm 
wondered if {{HFileCleaner}} and {{LogCleaner }}can treat the same such that we 
have only introduce two property keys in {{CleanerChore}} only. let me know 
what you think.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548428#comment-16548428
 ] 

Ted Yu commented on HBASE-20401:


{code}
  wait(waitIfNotFinished);
  waitTime += waitIfNotFinished;
{code}
The actual duration of wait may be different from waitIfNotFinished.
It would be better to compute the elapsed duration instead of using 
waitIfNotFinished directly in the last line above.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548429#comment-16548429
 ] 

Tak Lon (Stephen) Wu commented on HBASE-20401:
--

[~reidchan] I have done updated it with new property keys in {{{HFileCleaner}}} 
as well (also in the [PR#82|https://github.com/apache/hbase/pull/82]) and 
thanks for letting me know about only provide master branch for review, I'm 
wondered if {{HFileCleaner}} and {{LogCleaner }}can treat the same such that we 
have only introduce two property keys in {{CleanerChore}} only. let me know 
what you think.

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20401) Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable

2018-07-18 Thread Tak Lon (Stephen) Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak Lon (Stephen) Wu updated HBASE-20401:
-
Attachment: HBASE-20401.master.003.patch

> Make `MAX_WAIT` and `waitIfNotFinished` in CleanerContext configurable
> --
>
> Key: HBASE-20401
> URL: https://issues.apache.org/jira/browse/HBASE-20401
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 3.0.0, 1.5.0, 2.0.0-beta-1, 1.4.4, 2.0.0
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20401.branch-1.001.patch, 
> HBASE-20401.branch-1.002.patch, HBASE-20401.branch-2.001.patch, 
> HBASE-20401.master.001.patch, HBASE-20401.master.002.patch, 
> HBASE-20401.master.003.patch
>
>
> When backporting HBASE-18309 in HBASE-20352, the deleteFiles calls 
> CleanerContext.java#getResult with a waitIfNotFinished timeout to wait for 
> notification (notify) from the fs.delete file thread. there might be two 
> situation need to tune the MAX_WAIT in CleanerContext or waitIfNotFinished 
> when LogClearner call getResult.
>  # fs.delete never complete (strange but possible), then we need to wait for 
> a max of 60 seconds. here, 60 seconds might be too long
>  # getResult is waiting in the period of 500 milliseconds, but the fs.delete 
> has completed and setFromClear is set but yet notify(). one might want to 
> tune this 500 milliseconds to 200 or less .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20565) ColumnRangeFilter combined with ColumnPaginationFilter can produce incorrect result since 1.4

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548415#comment-16548415
 ] 

Hadoop QA commented on HBASE-20565:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
29s{color} | {color:red} hbase-client: The patch generated 1 new + 22 unchanged 
- 11 fixed = 23 total (was 33) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
20s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
7s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}160m 
18s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}212m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20565 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932040/HBASE-20565.v2.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ac7c75520e6f 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 

[jira] [Commented] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548403#comment-16548403
 ] 

Josh Elser commented on HBASE-20460:


.001

{quote}
    HBASE-20460 Improve off-heap docs for write path

    Based on original docs from Anoop. Also includes hbase-default.xml updates.
{quote}

> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HBASE-20460.001.patch
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20460:
---
Attachment: HBASE-20460.001.patch

> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HBASE-20460.001.patch
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-20460:
---
Assignee: Josh Elser  (was: Anoop Sam John)
  Status: Patch Available  (was: Open)

> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HBASE-20460.001.patch
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548378#comment-16548378
 ] 

Hadoop QA commented on HBASE-20907:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
39s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 19s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}140m 
55s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}186m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20907 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932061/HBASE-20907.v2.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c6307e8c969e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13670/testReport/ |
| Max. process+thread count | 4488 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13670/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Fix Intermittent failure on 

[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548354#comment-16548354
 ] 

Mike Drob commented on HBASE-20894:
---

Do you have a suggestion of a different library? It's also nice to avoid object 
serialization for security posture - 
https://www.owasp.org/index.php/Deserialization_of_untrusted_data

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20908) Infinite loop on regionserver if region replica are reduced

2018-07-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20908:
--
Status: Patch Available  (was: Open)

> Infinite loop on regionserver if region replica are reduced 
> 
>
> Key: HBASE-20908
> URL: https://issues.apache.org/jira/browse/HBASE-20908
> Project: HBase
>  Issue Type: Bug
>  Components: read replicas
>Affects Versions: 2.0.0, 1.2.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-20908.patch
>
>
> Steps to reproduce
> {code}
> hbase(main):003:0> create 'myTable','cf',{REGION_REPLICATION=>3}
> hbase(main):003:0> put 'myTable','r1','cf:col1','1'
> 0 row(s) in 0.1230 seconds
> hbase(main):004:0> disable 'myTable'
> alter '0 row(s) in 2.3040 seconds
> hbase(main):005:0> alter 'myTable',{REGION_REPLICATION=>1}
> Updating all regions with the new schema...
> 1/1 regions updated.
> Done.
> 0 row(s) in 11.9550 seconds
> hbase(main):006:0> enable 'myTable'
> 0 row(s) in 1.2620 seconds
> hbase(main):007:0> put 'myTable1','r2','cf:col1','1'
> 0 row(s) in 0.0060 seconds
> {code}
> This is the replica region request which will not be present now in Meta but 
> was there in cache. Server will say that he is not serving this region.
> {code}
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
>  org.apache.hadoop.hbase.NotServingRegionException: Region 
> d997d9b47a106216b9b117617ec09015 is not online on 
> 10.22.9.76,16020,1531341039091
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3124)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3106)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1714)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22773)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {code}
> Eventually, when we will update our cache after looking into meta , we will 
> get into an infinite loop as this event will not be replicated because the 
> location of the replica will not appear again.
> {code}
> java.net.SocketTimeoutException: callTimeout=120, callDuration=2181316: 
> Can't get the location null
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:170)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.RegionReplicaReplicationEndpoint$RetryingRpcCallable.call(RegionReplicaReplicationEndpoint.java:606)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't 
> get the location
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:178)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:105)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:89)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   ... 5 more
> Caused by: java.io.IOException: HRegionInfo was null in myTable, 
> row=keyvalues={myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:regioninfo/1531262022425/Put/vlen=41/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:seqnumDuringOpen/1531341209944/Put/vlen=8/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:server/1531341209944/Put/vlen=16/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:serverstartcode/1531341209944/Put/vlen=8/seqid=0}
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1289)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1179)
>   at 
> 

[jira] [Updated] (HBASE-20908) Infinite loop on regionserver if region replica are reduced

2018-07-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20908:
--
Attachment: HBASE-20908.patch

> Infinite loop on regionserver if region replica are reduced 
> 
>
> Key: HBASE-20908
> URL: https://issues.apache.org/jira/browse/HBASE-20908
> Project: HBase
>  Issue Type: Bug
>  Components: read replicas
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
> Attachments: HBASE-20908.patch
>
>
> Steps to reproduce
> {code}
> hbase(main):003:0> create 'myTable','cf',{REGION_REPLICATION=>3}
> hbase(main):003:0> put 'myTable','r1','cf:col1','1'
> 0 row(s) in 0.1230 seconds
> hbase(main):004:0> disable 'myTable'
> alter '0 row(s) in 2.3040 seconds
> hbase(main):005:0> alter 'myTable',{REGION_REPLICATION=>1}
> Updating all regions with the new schema...
> 1/1 regions updated.
> Done.
> 0 row(s) in 11.9550 seconds
> hbase(main):006:0> enable 'myTable'
> 0 row(s) in 1.2620 seconds
> hbase(main):007:0> put 'myTable1','r2','cf:col1','1'
> 0 row(s) in 0.0060 seconds
> {code}
> This is the replica region request which will not be present now in Meta but 
> was there in cache. Server will say that he is not serving this region.
> {code}
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
>  org.apache.hadoop.hbase.NotServingRegionException: Region 
> d997d9b47a106216b9b117617ec09015 is not online on 
> 10.22.9.76,16020,1531341039091
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3124)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3106)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1714)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22773)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {code}
> Eventually, when we will update our cache after looking into meta , we will 
> get into an infinite loop as this event will not be replicated because the 
> location of the replica will not appear again.
> {code}
> java.net.SocketTimeoutException: callTimeout=120, callDuration=2181316: 
> Can't get the location null
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:170)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.RegionReplicaReplicationEndpoint$RetryingRpcCallable.call(RegionReplicaReplicationEndpoint.java:606)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't 
> get the location
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:178)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:105)
>   at 
> org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:89)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   ... 5 more
> Caused by: java.io.IOException: HRegionInfo was null in myTable, 
> row=keyvalues={myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:regioninfo/1531262022425/Put/vlen=41/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:seqnumDuringOpen/1531341209944/Put/vlen=8/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:server/1531341209944/Put/vlen=16/seqid=0,
>  
> myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:serverstartcode/1531341209944/Put/vlen=8/seqid=0}
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1289)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1179)
>   at 
> 

[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548349#comment-16548349
 ] 

Hadoop QA commented on HBASE-20855:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
57s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
54s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
40s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 11s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapreduce.TestLoadIncrementalHFiles |
|   | 

[jira] [Created] (HBASE-20908) Infinite loop on regionserver if region replica are reduced

2018-07-18 Thread Ankit Singhal (JIRA)
Ankit Singhal created HBASE-20908:
-

 Summary: Infinite loop on regionserver if region replica are 
reduced 
 Key: HBASE-20908
 URL: https://issues.apache.org/jira/browse/HBASE-20908
 Project: HBase
  Issue Type: Bug
  Components: read replicas
Affects Versions: 2.0.0, 1.2.0
Reporter: Ankit Singhal
Assignee: Ankit Singhal


Steps to reproduce
{code}
hbase(main):003:0> create 'myTable','cf',{REGION_REPLICATION=>3}


hbase(main):003:0> put 'myTable','r1','cf:col1','1'
0 row(s) in 0.1230 seconds

hbase(main):004:0> disable 'myTable'
alter '0 row(s) in 2.3040 seconds

hbase(main):005:0> alter 'myTable',{REGION_REPLICATION=>1}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 11.9550 seconds

hbase(main):006:0> enable 'myTable'
0 row(s) in 1.2620 seconds

hbase(main):007:0> put 'myTable1','r2','cf:col1','1'
0 row(s) in 0.0060 seconds

{code}


This is the replica region request which will not be present now in Meta but 
was there in cache. Server will say that he is not serving this region.
{code}
com.google.protobuf.ServiceException: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException):
 org.apache.hadoop.hbase.NotServingRegionException: Region 
d997d9b47a106216b9b117617ec09015 is not online on 10.22.9.76,16020,1531341039091
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3124)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3106)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1714)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22773)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
{code}

Eventually, when we will update our cache after looking into meta , we will get 
into an infinite loop as this event will not be replicated because the location 
of the replica will not appear again.
{code}
java.net.SocketTimeoutException: callTimeout=120, callDuration=2181316: 
Can't get the location null
at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:170)
at 
org.apache.hadoop.hbase.replication.regionserver.RegionReplicaReplicationEndpoint$RetryingRpcCallable.call(RegionReplicaReplicationEndpoint.java:606)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get 
the location
at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:178)
at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:105)
at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:89)
at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
... 5 more
Caused by: java.io.IOException: HRegionInfo was null in myTable, 
row=keyvalues={myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:regioninfo/1531262022425/Put/vlen=41/seqid=0,
 
myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:seqnumDuringOpen/1531341209944/Put/vlen=8/seqid=0,
 
myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:server/1531341209944/Put/vlen=16/seqid=0,
 
myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:serverstartcode/1531341209944/Put/vlen=8/seqid=0}
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1289)
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1179)
at 
org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:170)
... 8 more

{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548330#comment-16548330
 ] 

Vladimir Rodionov commented on HBASE-20894:
---

I was asked some times ago to replace all java serialization code in 
Backup module with protobuf stuff. The result:

# generated code increased total code size by 50%
# glue code in the module to serialize/deserialize objects is more complex now

So, in general, I am not a proponent of protobuf, especially for internal (not 
public) API.  So, yes, I do not think that moving *everything* to protobuf is a 
right approach, but, of course, this is only my humble opinion.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548316#comment-16548316
 ] 

Hadoop QA commented on HBASE-20907:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}116m 
10s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20907 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932061/HBASE-20907.v2.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4c5f7858f172 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668/testReport/ |
| Max. process+thread count | 4798 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Fix Intermittent failure on 

[jira] [Commented] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548298#comment-16548298
 ] 

Josh Elser commented on HBASE-20460:


Let me pick it up for ya :)

> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Priority: Critical
> Fix For: 2.2.0
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20460) Doc offheap write-path

2018-07-18 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser reassigned HBASE-20460:
--

Assignee: Anoop Sam John

> Doc offheap write-path
> --
>
> Key: HBASE-20460
> URL: https://issues.apache.org/jira/browse/HBASE-20460
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Offheaping
>Reporter: stack
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.2.0
>
>
> We have an empty section in refguide that needs filling in on how to enable 
> offheap write-path, how to know you've set it up right or not, how to tune 
> it, and how it relates to direct memory allocation and offheap read-path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20901) Reducing region replica has no effect

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548294#comment-16548294
 ] 

Ted Yu commented on HBASE-20901:


lgtm, pending QA.

> Reducing region replica has no effect
> -
>
> Key: HBASE-20901
> URL: https://issues.apache.org/jira/browse/HBASE-20901
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
>  Labels: replica
> Attachments: HBASE-20901.patch, HBASE-20901_v1.patch
>
>
> While reducing the region replica, server name(sn) and state column of the 
> replica are not getting deleted, resulting in assignment manager to think 
> that these regions are CLOSED and assign them again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548285#comment-16548285
 ] 

Hadoop QA commented on HBASE-20855:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
47s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
36s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 35s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 53s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | 

[jira] [Commented] (HBASE-20901) Reducing region replica has no effect

2018-07-18 Thread Ankit Singhal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548273#comment-16548273
 ] 

Ankit Singhal commented on HBASE-20901:
---

thanks [~yuzhih...@gmail.com] for taking the look.
bq. The new methods can be package private, right ?
I'm using them in the test , so either I can make them default or annotate 
them. In the new patch, I have annotated it , just to have consistency with 
other similar methods.

> Reducing region replica has no effect
> -
>
> Key: HBASE-20901
> URL: https://issues.apache.org/jira/browse/HBASE-20901
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
>  Labels: replica
> Attachments: HBASE-20901.patch, HBASE-20901_v1.patch
>
>
> While reducing the region replica, server name(sn) and state column of the 
> replica are not getting deleted, resulting in assignment manager to think 
> that these regions are CLOSED and assign them again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20901) Reducing region replica has no effect

2018-07-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-20901:
--
Attachment: HBASE-20901_v1.patch

> Reducing region replica has no effect
> -
>
> Key: HBASE-20901
> URL: https://issues.apache.org/jira/browse/HBASE-20901
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Major
>  Labels: replica
> Attachments: HBASE-20901.patch, HBASE-20901_v1.patch
>
>
> While reducing the region replica, server name(sn) and state column of the 
> replica are not getting deleted, resulting in assignment manager to think 
> that these regions are CLOSED and assign them again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548269#comment-16548269
 ] 

Mike Drob commented on HBASE-20894:
---

Thanks for asking, [~vrodionov].

Java object serialization is very brittle. There's no reason to be storing the 
full object graph when we really just want to be storing some data. This will 
give us more flexibility in the future for what we do and how we persist things.

I'm not going to be a fanatical champion for protobuf here, it just seemed like 
a straightforward solution given that we already have PB for other things. 
Personally, I wouldn't oppose a solution that uses some other format like XML 
or JSON or SequenceFile or whatever.

I'm not too concerned about the performance minutia of this, since it should 
only be happening on startup and shutdown. I also don't think the impact of the 
generated code is too great. I imagine that the space used on disk is going to 
be less with PB than with object serialization, but again, that's not a real 
concern for me either.

I'm having trouble reading if you're opposed to this change or if you are 
trying to understand the motivation, can you clarify?

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf

2018-07-18 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548231#comment-16548231
 ] 

Vladimir Rodionov commented on HBASE-20894:
---

[~md...@cloudera.com], I am going to ping you one more time.

> Move BucketCache from java serialization to protobuf
> 
>
> Key: HBASE-20894
> URL: https://issues.apache.org/jira/browse/HBASE-20894
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Affects Versions: 2.0.0
>Reporter: Mike Drob
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch
>
>
> We should use a better serialization format instead of Java Serialization for 
> the BucketCache entry persistence.
> Suggested by Chris McCown, who does not appear to have a JIRA account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548209#comment-16548209
 ] 

Hadoop QA commented on HBASE-20853:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
36s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
51s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
13s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20853 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932087/HBASE-20853.master.005.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4f048db7229f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 8c85763327 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13671/testReport/ |
| Max. process+thread count | 275 (vs. ulimit of 1) |
| modules | C: hbase-client U: hbase-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/13671/console |
| Powered by 

[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Attachment: (was: HBASE-20895-branch-1.patch)

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548187#comment-16548187
 ] 

Andrew Purtell commented on HBASE-20895:


Yikes, removing those patches, that didn't work out. 

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Attachment: (was: HBASE-20895-branch-1.3.patch)

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20876) Improve docs style in HConstants

2018-07-18 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548172#comment-16548172
 ] 

Wei-Chiu Chuang commented on HBASE-20876:
-

Thanks Reid, sorry I didn't notice your comment. Will make sure to use 
git-format next time.

> Improve docs style in HConstants
> 
>
> Key: HBASE-20876
> URL: https://issues.apache.org/jira/browse/HBASE-20876
> Project: HBase
>  Issue Type: Improvement
>Reporter: Reid Chan
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: beginner, beginners, newbie
> Fix For: 3.0.0
>
> Attachments: HBASE-20876.master.001.patch
>
>
> In {{HConstants}}, there's a docs snippet:
> {code}
>  /** Don't use it! This'll get you the wrong path in a secure cluster.
>   * Use FileSystem.getHomeDirectory() or
>   * "/user/" + UserGroupInformation.getCurrentUser().getShortUserName()  */
> {code}
> It's ugly style.
> Let's improve this docs with following
> {code}
> /**
>  * Description
>  */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20905) branch-1 docker build fails

2018-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548122#comment-16548122
 ] 

Hudson commented on HBASE-20905:


Results for branch branch-1
[build #384 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/384/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/384//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/384//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/384//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> branch-1 docker build fails
> ---
>
> Key: HBASE-20905
> URL: https://issues.apache.org/jira/browse/HBASE-20905
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.5.0
>Reporter: Jingyun Tian
>Assignee: Mike Drob
>Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20905.branch-1.001.patch
>
>
> Docker build for precommit fails:
> {quote}
> 19:08:29 Cleaning up...19:08:29 Command python setup.py egg_info failed with 
> error code 1 in /tmp/pip_build_root/pylint*19:08:29* Storing debug log for 
> failure in /root/.pip/pip.log*19:08:29* The command '/bin/sh -c pip install 
> pylint' returned a non-zero code: 1*19:08:29* 19:08:29 Total Elapsed time: 0m 
> 3s*19:08:29* 19:08:29 ERROR: Docker failed to build image.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548108#comment-16548108
 ] 

Ted Yu commented on HBASE-20855:


Triggered another QA run:
https://builds.apache.org/job/PreCommit-HBASE-Build/13669/

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20257) hbase-spark should not depend on com.google.code.findbugs.jsr305

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548077#comment-16548077
 ] 

Ted Yu commented on HBASE-20257:


[~busbey]:
What do you think of the latest patch ?

> hbase-spark should not depend on com.google.code.findbugs.jsr305
> 
>
> Key: HBASE-20257
> URL: https://issues.apache.org/jira/browse/HBASE-20257
> Project: HBase
>  Issue Type: Task
>  Components: build, spark
>Affects Versions: 3.0.0
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-20257.v01.patch, HBASE-20257.v02.patch, 
> HBASE-20257.v03.patch, HBASE-20257.v04.patch, HBASE-20257.v05.patch
>
>
> The following can be observed in the build output of master branch:
> {code}
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
> with message:
> We don't allow the JSR305 jar from the Findbugs project, see HBASE-16321.
> Found Banned Dependency: com.google.code.findbugs:jsr305:jar:1.3.9
> Use 'mvn dependency:tree' to locate the source of the banned dependencies.
> {code}
> Here is related snippet from hbase-spark/pom.xml:
> {code}
> 
>   com.google.code.findbugs
>   jsr305
> {code}
> Dependency on jsr305 should be dropped.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548015#comment-16548015
 ] 

Andrew Purtell commented on HBASE-20895:


Here are two WIP patches that take a similar approach to what we did on 
HBASE-14050 but rather than leaving the bytebuffer reference in place (comments 
indicate we are trying to get it GCed by nulling the reference) use an atomic 
reference to avoid use of a reference by one thread after a null has been 
stored back to it by another. 

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: HBASE-20895-branch-1.3.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess

2018-07-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-20895:
---
Attachment: HBASE-20895-branch-1.patch
HBASE-20895-branch-1.3.patch

> NPE in RpcServer#readAndProcess
> ---
>
> Key: HBASE-20895
> URL: https://issues.apache.org/jira/browse/HBASE-20895
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.3.2
>Reporter: Andrew Purtell
>Assignee: Monani Mihir
>Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.4.6
>
> Attachments: HBASE-20895-branch-1.3.patch, HBASE-20895-branch-1.patch
>
>
> {noformat}
> 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - 
> RpcServer.listener,port=60020: Caught exception while reading:
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This looks like it could be a use after close problem if there is concurrent 
> access to a Connection.
> In process() we might store a null back to the 'data' field.
> Meanwhile in readAndProcess() we have a case where we might be blocked on a 
> channel read and then after coming back from the read we go to use 'data' 
> after a null has been written back, leading to a NPE.
> {quote}count = channelRead(channel, data);
>  1761 ---> if (count >= 0 && *data.remaining()* == 0)
>  \{ process(); }{quote}
> Whether a NPE happens or not is going to depend on the timing of the store 
> back to 'data' in another thread and use of 'data' in this thread and whether 
> or not the JVM has optimized away a reload of 'data' (it's not declared 
> volatile)
> We should do a null check here just to be defensive. We should also look at 
> whether concurrent access to the Connection is happening and intended.The 
> above is just a theory. We should also look at other execution sequences that 
> could lead to 'data' being null in this location. At a glance I didn't find 
> one but the store to 'data' happens behind conditionals so it is possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548010#comment-16548010
 ] 

stack commented on HBASE-20893:
---

bq. This bug is introduced by AMv2.

Yeah. I went back and looked at branch-2 code. Yeah, RS will crash out if it 
cannot effect smooth transition and then the Master is supposed to be able to 
do fix-up on any condition it may find. Thanks.

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20893.branch-2.0.001.patch, 
> HBASE-20893.branch-2.0.002.patch
>
>
> Similar case as HBASE-20878.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547999#comment-16547999
 ] 

Ted Yu commented on HBASE-20855:


See INFRA-16784

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue

2018-07-18 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547985#comment-16547985
 ] 

Jingyun Tian commented on HBASE-20855:
--

[~mdrob] Sir, my new patch can not trigger test. Can you help check why this 
happen?

> PeerConfigTracker only support one listener will cause problem when there is 
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.4.0, 1.5.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, 
> HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, 
> HBASE-20855.branch-1.004.patch, HBASE-20855.branch-1.005.patch, 
> HBASE-20855.branch-1.006.patch
>
>
> {code}
> public void init(Context context) throws IOException {
>  this.ctx = context;
>  if (this.ctx != null){
>  ReplicationPeer peer = this.ctx.getReplicationPeer();
>  if (peer != null){
>  peer.trackPeerConfigChanges(this);
>  } else {
>  LOG.warn("Not tracking replication peer config changes for Peer Id " + 
> this.ctx.getPeerId() +
>  " because there's no such peer");
>  }
>  }
> }
> {code}
> As we know, replication source will set itself to the PeerConfigTracker in 
> ReplicationPeer. When there is one or more recovered queue, each queue will 
> generate a new replication source, But they share the same ReplicationPeer. 
> Then when it calls setListener, the new generated one will cover the older 
> one. Thus there will only has one ReplicationPeer that receive the peer 
> config change notify.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener){
>  this.listener = listener;
> }
> {code}
>  
> To solve this,  PeerConfigTracker need to support multiple listener and 
> listener should be removed when the replication endpoint terminated.
> I will upload a patch later with fix and UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-18 Thread Balazs Meszaros (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547964#comment-16547964
 ] 

Balazs Meszaros edited comment on HBASE-20853 at 7/18/18 3:23 PM:
--

Let me revert the delete function [~chia7712]. The problem is 
{{delete(List)}} does not throw an exception if the row is missing, but 
this behavior is not defined for {{delete(Delete)}}.

Other functions seem to be ok.


was (Author: balazs.meszaros):
Let me revert the delete function [~chia7712]. The problem is 
{{delete(List)}} does not throw an exception if the row is missing, but 
this behavior is not defined for {{delete(Delete)}}.

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch, HBASE-20853.master.003.patch, 
> HBASE-20853.master.004.patch, HBASE-20853.master.005.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-18 Thread Balazs Meszaros (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547964#comment-16547964
 ] 

Balazs Meszaros commented on HBASE-20853:
-

Let me revert the delete function [~chia7712]. The problem is 
{{delete(List)}} does not throw an exception if the row is missing, but 
this behavior is not defined for {{delete(Delete)}}.

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch, HBASE-20853.master.003.patch, 
> HBASE-20853.master.004.patch, HBASE-20853.master.005.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"

2018-07-18 Thread Balazs Meszaros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balazs Meszaros updated HBASE-20853:

Attachment: HBASE-20853.master.005.patch

> Polish "Add defaults to Table Interface so Implementors don't have to"
> --
>
> Key: HBASE-20853
> URL: https://issues.apache.org/jira/browse/HBASE-20853
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: stack
>Assignee: Balazs Meszaros
>Priority: Major
>  Labels: beginner, beginners
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20853.master.001.patch, 
> HBASE-20853.master.002.patch, HBASE-20853.master.003.patch, 
> HBASE-20853.master.004.patch, HBASE-20853.master.005.patch
>
>
> This issue is to address feedback that came in after commit on the parent 
> (FYI [~chia7712]). See tail of parent issue and amendment attached to parent 
> adding better defaults to the Table Interface.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-18 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547959#comment-16547959
 ] 

Allan Yang edited comment on HBASE-20893 at 7/18/18 3:17 PM:
-

{quote}
I'm thinking this issue has always been present. What you lads thin?
{quote}
In HBase-1.x, split happens in RS, if RS crashes, master will clean up the 
mess(roll back). This bug is introduced by AMv2.


was (Author: allan163):
{code}
I'm thinking this issue has always been present. What you lads thin?
{code}
In HBase-1.x, split happens in RS, if RS crashes, master will clean up the 
mess(roll back). This bug is introduced by AMv2.

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20893.branch-2.0.001.patch, 
> HBASE-20893.branch-2.0.002.patch
>
>
> Similar case as HBASE-20878.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing

2018-07-18 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547959#comment-16547959
 ] 

Allan Yang commented on HBASE-20893:


{code}
I'm thinking this issue has always been present. What you lads thin?
{code}
In HBase-1.x, split happens in RS, if RS crashes, master will clean up the 
mess(roll back). This bug is introduced by AMv2.

> Data loss if splitting region while ServerCrashProcedure executing
> --
>
> Key: HBASE-20893
> URL: https://issues.apache.org/jira/browse/HBASE-20893
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20893.branch-2.0.001.patch, 
> HBASE-20893.branch-2.0.002.patch
>
>
> Similar case as HBASE-20878.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547947#comment-16547947
 ] 

stack commented on HBASE-20884:
---

Thanks. Added minor note to RN.

> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-18 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20884:
--
Release Note: 
Class org.apache.hadoop.hbase.util.Base64 has been removed in it's entirety 
from HBase 2+. In HBase 1, unused methods have been removed from the class and 
the audience was changed from  Public to Private. This class was originally 
intended as an internal utility class that could be used externally but 
thinking since changed; these classes should not have been advertised as public 
to end-users.

This represents an incompatible change for users who relied on this 
implementation. An alternative implementation for affected clients is available 
at java.util.Base64 when using Java 8 or newer; be aware, it may encode/decode 
differently. For clients seeking to restore this specific implementation, it is 
available in the public domain for download at 
http://iharder.sourceforge.net/current/java/base64/



  was:
Class org.apache.hadoop.hbase.util.Base64 has been removed in it's entirety 
from HBase 2+. In HBase 1, unused methods have been removed from the class and 
the audience was changed from  Public to Private. This class was originally 
intended as an internal utility class since inception, and should not have been 
advertised as public to end-users.

This represents an incompatible change for users who relied on this 
implementation. An alternative implementation for affected clients is available 
at java.util.Base64 when using Java 8 or newer. For clients seeking to restore 
this specific implementation, it is available in the public domain for download 
at http://iharder.sourceforge.net/current/java/base64/




> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64

2018-07-18 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547922#comment-16547922
 ] 

Mike Drob commented on HBASE-20884:
---

bq. Out of interest, any differences between our old encoding and this new one?
The old one would add new lines after 76 characters to be MIME-encoding 
compatible (with comments in the implementation that this wasn't strictly 
base64 encoding compatible), the new one doesn't do this. We were stripping out 
the line breaks before anyway.

> Replace usage of our Base64 implementation with java.util.Base64
> 
>
> Key: HBASE-20884
> URL: https://issues.apache.org/jira/browse/HBASE-20884
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1
>
> Attachments: HBASE-20884.branch-1.001.patch, 
> HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch
>
>
> We have a public domain implementation of Base64 that is copied into our code 
> base and infrequently receives updates. We should replace usage of that with 
> the new Java 8 java.util.Base64 where possible.
> For the migration, I propose a phased approach.
> * Deprecate on 1.x and 2.x to signal to users that this is going away.
> * Replace usages on branch-2 and master with j.u.Base64
> * Delete our implementation of Base64 on master.
> Does this seem in line with our API compatibility requirements?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20828) Finish-up AMv2 Design/List of Tenets/Specification of operation

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547915#comment-16547915
 ] 

stack commented on HBASE-20828:
---

Another idea (From [~Apache9] down in HBASE-20878: "Maybe we could introduce a 
state called ABNORMALLY_CLOSED, which indicates that the region will be 
processed by SCP."

Also, explain lasthost in hbase:meta. See comment in HBASE-20878 "The lastHost 
should not be used for critical condition... (See HBASE-20792)". Or, explain 
all fields in hbase:meta and how they change -- after consideration (e.g. in 
the case of info:sn/lasthost, the prescription is not to rely on it).



> Finish-up AMv2 Design/List of Tenets/Specification of operation
> ---
>
> Key: HBASE-20828
> URL: https://issues.apache.org/jira/browse/HBASE-20828
> Project: HBase
>  Issue Type: Umbrella
>  Components: amv2
>Reporter: stack
>Priority: Major
>
> AMv2 is missing specification. There are too many grey-areas still. Also 
> missing are a concise listing of the tenets of AMv2 operation. Here are some 
> examples:
>  * HBASE-19529 "Handle null states in AM": Asks how we should treat null 
> state in hbase:meta. What does it 'mean'. We seem to treat it differently 
> dependent on context. Needs clarification. [~Apache9] recently asked similar 
> about the meaning of OFFLINE.
>  * Logging needs to have a particular form to help trace Procedure progress; 
> needs a write-up.
> Lets fill in items to address in this umbrella issue. Can address in 
> subissues and produce specification doc too. We have the below but these are 
> mostly (incomplete) description for devs on pv2 and amv2; the specification 
> is missing:
> http://hbase.apache.org/book.html#pv2
> http://hbase.apache.org/book.html#amv2
> (Other areas include addressing what is up w/ rollback -- when, how much, and 
> when it is not appropriate -- as well as recommendation on Procedures 
> coarseness, locking -- is it ok to lock table in alter table procedure for 
> the life of the procedure? -- and so on).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547648#comment-16547648
 ] 

Ted Yu commented on HBASE-20907:


+1

> Fix Intermittent failure on TestProcedurePriority
> -
>
> Key: HBASE-20907
> URL: https://issues.apache.org/jira/browse/HBASE-20907
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20907.patch, HBASE-20907.v2.patch
>
>
> From a local UT check against 2.1.0-RC1, HMaster failed to initialize before 
> time out. Checking the test log we could see below message:
> {noformat}
> 2018-07-17 20:06:37,142 DEBUG [Thread-4003] 
> client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=6, 
> started=4173 ms ago, cancelled=false, msg=java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preGet(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2520)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2460)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> , details=row 'hbase:namespace' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, 
> hostname=hdpdevm1.et2sqa.tbsite.net,59254,1531829189215, seqNum=-1, 
> exception=java.io.IOException: java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> ...
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:403)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:94)
> {noformat}
> In current test code we will set {{FAIL}} to true w/o checking whether 
> namespace manager is already up, and if not lucky we will run into the above 
> case and get a timeout.
> The fix will be quite straight forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547644#comment-16547644
 ] 

Yu Li commented on HBASE-20907:
---

bq. Is the above a little cleaner?
Yes, we could make it this way. Please note that the failure won't always 
reproduce so it's hard to verify the patch through UT, but in theory it's a 
valid fix.

Patch v2 adopts the review suggestion.

> Fix Intermittent failure on TestProcedurePriority
> -
>
> Key: HBASE-20907
> URL: https://issues.apache.org/jira/browse/HBASE-20907
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20907.patch, HBASE-20907.v2.patch
>
>
> From a local UT check against 2.1.0-RC1, HMaster failed to initialize before 
> time out. Checking the test log we could see below message:
> {noformat}
> 2018-07-17 20:06:37,142 DEBUG [Thread-4003] 
> client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=6, 
> started=4173 ms ago, cancelled=false, msg=java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preGet(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2520)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2460)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> , details=row 'hbase:namespace' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, 
> hostname=hdpdevm1.et2sqa.tbsite.net,59254,1531829189215, seqNum=-1, 
> exception=java.io.IOException: java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> ...
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:403)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:94)
> {noformat}
> In current test code we will set {{FAIL}} to true w/o checking whether 
> namespace manager is already up, and if not lucky we will run into the above 
> case and get a timeout.
> The fix will be quite straight forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20907) Fix Intermittent failure on TestProcedurePriority

2018-07-18 Thread Yu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-20907:
--
Attachment: HBASE-20907.v2.patch

> Fix Intermittent failure on TestProcedurePriority
> -
>
> Key: HBASE-20907
> URL: https://issues.apache.org/jira/browse/HBASE-20907
> Project: HBase
>  Issue Type: Test
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20907.patch, HBASE-20907.v2.patch
>
>
> From a local UT check against 2.1.0-RC1, HMaster failed to initialize before 
> time out. Checking the test log we could see below message:
> {noformat}
> 2018-07-17 20:06:37,142 DEBUG [Thread-4003] 
> client.RpcRetryingCallerImpl(131): Call exception, tries=6, retries=6, 
> started=4173 ms ago, cancelled=false, msg=java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preGet(RegionCoprocessorHost.java:838)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2520)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2460)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> , details=row 'hbase:namespace' on table 'hbase:meta' at 
> region=hbase:meta,,1.1588230740, 
> hostname=hdpdevm1.et2sqa.tbsite.net,59254,1531829189215, seqNum=-1, 
> exception=java.io.IOException: java.io.IOException: Inject error
> at 
> org.apache.hadoop.hbase.master.procedure.TestProcedurePriority$MyCP.preGetOp(TestProcedurePriority.java:92)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$19.call(RegionCoprocessorHost.java:841)
> ...
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:403)
> at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:94)
> {noformat}
> In current test code we will set {{FAIL}} to true w/o checking whether 
> namespace manager is already up, and if not lucky we will run into the above 
> case and get a timeout.
> The fix will be quite straight forward.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >