[jira] [Created] (HBASE-27035) failed to set file permission when node crash

2022-05-14 Thread lujie (Jira)
lujie created HBASE-27035:
-

 Summary: failed to set file permission  when node crash
 Key: HBASE-27035
 URL: https://issues.apache.org/jira/browse/HBASE-27035
 Project: HBase
  Issue Type: Bug
Reporter: lujie


In SecureBulkLoadManager#secureBulkLoadHFiles, we have code like this:
{code:java}
for (Pair<byte[], String> el : familyPaths) {
  Path stageFamily = new Path(bulkToken, Bytes.toString(el.getFirst()));
  if (!fs.exists(stageFamily)) {
    fs.mkdirs(stageFamily);
    fs.setPermission(stageFamily, PERM_ALL_ACCESS);
  }
}
{code}
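If the process crashes after mkdirs but before setPermission and then restarts, the directory already exists, so the exists check skips the block and setPermission is never retried.

We should make this code like SnapshotScannerHDFSAclHelper#setCommonDirectoryPermission, which sets the permission on every pass:
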
{code:java}
for (Path path : paths) {
  createDirIfNotExist(path);
  fs.setPermission(path, new FsPermission(
      conf.get(COMMON_DIRECTORY_PERMISSION, COMMON_DIRECTORY_PERMISSION_DEFAULT)));
}
{code}
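
The difference between the two patterns can be sketched with a small in-memory model. This is only an illustration under stated assumptions: FakeFs, guardedSetup, idempotentSetup and the "755"/"777" permission strings are hypothetical stand-ins, not the Hadoop FileSystem API or the HBase code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model; not the Hadoop FileSystem API. */
final class StagingDirSketch {
    static final String DEFAULT_PERM = "755";     // assumed default for new directories
    static final String PERM_ALL_ACCESS = "777";  // assumed target permission

    /** Maps a directory path to its permission string. */
    static final class FakeFs {
        private final Map<String, String> dirs = new HashMap<>();

        boolean exists(String path) {
            return dirs.containsKey(path);
        }

        void mkdirs(String path) {
            dirs.putIfAbsent(path, DEFAULT_PERM);
        }

        void setPermission(String path, String perm) {
            if (!dirs.containsKey(path)) {
                throw new IllegalStateException("no such directory: " + path);
            }
            dirs.put(path, perm);
        }

        String permissionOf(String path) {
            return dirs.get(path);
        }
    }

    /** Reported pattern: the permission is set only when the directory is newly created. */
    static void guardedSetup(FakeFs fs, String dir) {
        if (!fs.exists(dir)) {
            fs.mkdirs(dir);
            fs.setPermission(dir, PERM_ALL_ACCESS);
        }
    }

    /** Suggested pattern: create the directory if missing, then always set the permission. */
    static void idempotentSetup(FakeFs fs, String dir) {
        if (!fs.exists(dir)) {
            fs.mkdirs(dir);
        }
        fs.setPermission(dir, PERM_ALL_ACCESS);
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        fs.mkdirs("/staging/cf1"); // simulate a crash right after mkdirs

        guardedSetup(fs, "/staging/cf1"); // after restart: block is skipped
        System.out.println("guarded:    " + fs.permissionOf("/staging/cf1"));

        idempotentSetup(fs, "/staging/cf1"); // after restart: permission is repaired
        System.out.println("idempotent: " + fs.permissionOf("/staging/cf1"));
    }
}
```

Because setting the same permission twice is harmless, the second pattern can be re-run safely after a crash at any point.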

--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HBASE-25877) Add access check for switchCompaction

2021-05-11 Thread lujie (Jira)
lujie created HBASE-25877:
-

 Summary: Add access  check for switchCompaction
 Key: HBASE-25877
 URL: https://issues.apache.org/jira/browse/HBASE-25877
 Project: HBase
  Issue Type: Bug
Reporter: lujie


Should we add an access check for 
org.apache.hadoop.hbase.regionserver.CompactSplit.switchCompaction?





[jira] [Created] (HBASE-25558) Adding audit log for execMasterService

2021-02-08 Thread lujie (Jira)
lujie created HBASE-25558:
-

 Summary: Adding audit log for execMasterService
 Key: HBASE-25558
 URL: https://issues.apache.org/jira/browse/HBASE-25558
 Project: HBase
  Issue Type: Bug
Reporter: lujie








[jira] [Resolved] (HBASE-25422) update_all_config should not be executed by non-admin user!!!

2020-12-30 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie resolved HBASE-25422.
---
Resolution: Duplicate

> update_all_config should not be executed by non-admin user!!!
> -
>
> Key: HBASE-25422
> URL: https://issues.apache.org/jira/browse/HBASE-25422
> Project: HBase
>  Issue Type: Bug
>Reporter: lujie
>Priority: Critical
> Attachments: image-2020-12-20-12-50-23-433.png
>
>
> !image-2020-12-20-12-50-23-433.png!





[jira] [Created] (HBASE-25456) setRegionStateInMeta need security check

2020-12-30 Thread lujie (Jira)
lujie created HBASE-25456:
-

 Summary: setRegionStateInMeta need security check
 Key: HBASE-25456
 URL: https://issues.apache.org/jira/browse/HBASE-25456
 Project: HBase
  Issue Type: Bug
Reporter: lujie
Assignee: lujie








[jira] [Created] (HBASE-25441) Unauthorized client can shutdown the regionserver

2020-12-23 Thread lujie (Jira)
lujie created HBASE-25441:
-

 Summary: Unauthorized client can shutdown the regionserver
 Key: HBASE-25441
 URL: https://issues.apache.org/jira/browse/HBASE-25441
 Project: HBase
  Issue Type: Bug
Reporter: lujie








[jira] [Reopened] (HBASE-25432) we should add security checks for setTableStateInMeta

2020-12-22 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie reopened HBASE-25432:
---

> we should add security checks for setTableStateInMeta
> -
>
> Key: HBASE-25432
> URL: https://issues.apache.org/jira/browse/HBASE-25432
> Project: HBase
>  Issue Type: Bug
>Reporter: lujie
>Priority: Major
>






[jira] [Resolved] (HBASE-25432) we should add security checks for list_namespace_tables

2020-12-22 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie resolved HBASE-25432.
---
Resolution: Not A Problem

> we should add security checks for list_namespace_tables
> ---
>
> Key: HBASE-25432
> URL: https://issues.apache.org/jira/browse/HBASE-25432
> Project: HBase
>  Issue Type: Bug
>Reporter: lujie
>Priority: Major
>






[jira] [Created] (HBASE-25432) we should add missing security checks for list_namespace_tables and listTableDescriptorsByNamespace

2020-12-21 Thread lujie (Jira)
lujie created HBASE-25432:
-

 Summary: we should add missing security checks for 
list_namespace_tables and listTableDescriptorsByNamespace
 Key: HBASE-25432
 URL: https://issues.apache.org/jira/browse/HBASE-25432
 Project: HBase
  Issue Type: Bug
Reporter: lujie








[jira] [Created] (HBASE-25422) update_all_config can be executed by non-admin user

2020-12-19 Thread lujie (Jira)
lujie created HBASE-25422:
-

 Summary: update_all_config can be executed by non-admin user
 Key: HBASE-25422
 URL: https://issues.apache.org/jira/browse/HBASE-25422
 Project: HBase
  Issue Type: Bug
Reporter: lujie
 Attachments: image-2020-12-20-12-50-23-433.png

!image-2020-12-20-12-50-23-433.png!





[jira] [Created] (HBASE-25407) list_regions make potential sensitive information disclosure

2020-12-17 Thread lujie (Jira)
lujie created HBASE-25407:
-

 Summary: list_regions make potential sensitive information 
disclosure
 Key: HBASE-25407
 URL: https://issues.apache.org/jira/browse/HBASE-25407
 Project: HBase
  Issue Type: Bug
Reporter: lujie
 Attachments: image-2020-12-18-13-00-20-126.png

I found that I can read other users' region information, which is not expected.

For example, I create a table as sysadmin, and then I can read its region information as user1.
!image-2020-12-18-13-00-20-126.png!

list_regions was introduced by https://issues.apache.org/jira/browse/HBASE-14925





[jira] [Resolved] (HBASE-25332) one NPE

2020-12-11 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie resolved HBASE-25332.
---
Resolution: Fixed

> one NPE
> ---
>
> Key: HBASE-25332
> URL: https://issues.apache.org/jira/browse/HBASE-25332
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.2.7, 2.3.4
>
>
> * getData can return null at
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L615]
> or
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L619]
> All its callers have a null check except at
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-server/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java#L467]
> We should add a null check for the potential NPEs.





[jira] [Reopened] (HBASE-25332) one NPE

2020-12-11 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie reopened HBASE-25332:
---

> one NPE
> ---
>
> Key: HBASE-25332
> URL: https://issues.apache.org/jira/browse/HBASE-25332
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.2.7, 2.3.4
>
>
> * getData can return null at
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L615]
> or
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-zookeeper/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java#L619]
> All its callers have a null check except at
> [https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-server/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java#L467]
> We should add a null check for the potential NPEs.





[jira] [Created] (HBASE-25332) One pontential NPE

2020-11-25 Thread lujie (Jira)
lujie created HBASE-25332:
-

 Summary: One pontential NPE
 Key: HBASE-25332
 URL: https://issues.apache.org/jira/browse/HBASE-25332
 Project: HBase
  Issue Type: Bug
Reporter: lujie


peek can return null at

[https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java#L108]

All its callers have a null check except at

[https://github.com/apache/hbase/blob/1726160839368df14602da1618e3538955b25f74/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedKeyValueHeap.java#L110]

We should add a null check for the potential NPE.
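
As a generic illustration of this kind of guard, here is a minimal sketch using java.util.PriorityQueue, whose peek() also returns null when the queue is empty. The class and method names are hypothetical and this is not the KeyValueHeap code; returning -1 for "no current element" is just one possible convention.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch; it only mirrors the "peek may return null" situation. */
final class PeekNullCheckSketch {
    /** Unsafe caller: dereferences the result of peek() without a null check. */
    static int unsafeTopLength(PriorityQueue<String> heap) {
        return heap.peek().length(); // NullPointerException when the heap is empty
    }

    /** Safe caller: treats an empty heap as "no current element". */
    static int safeTopLength(PriorityQueue<String> heap) {
        String top = heap.peek();
        if (top == null) {
            return -1;
        }
        return top.length();
    }

    public static void main(String[] args) {
        PriorityQueue<String> heap = new PriorityQueue<>();
        System.out.println(safeTopLength(heap)); // empty heap, no exception
        heap.add("row1");
        System.out.println(safeTopLength(heap));
    }
}
```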





[jira] [Resolved] (HBASE-25023) NPE while shutdown master node

2020-09-27 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie resolved HBASE-25023.
---
Fix Version/s: 2.2.6
   Resolution: Fixed

> NPE while shutdown master node
> --
>
> Key: HBASE-25023
> URL: https://issues.apache.org/jira/browse/HBASE-25023
> Project: HBase
>  Issue Type: Bug
>Reporter: lujie
>Assignee: Junhong Xu
>Priority: Major
> Fix For: 2.2.6
>
>
> While shutting down the master node, we see this exception:
> {code:java}
> 2020-09-14 06:48:29,530 ERROR [PEWorker-16] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception: pid=111, ppid=64, state=RUNNABLE, locked=true; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:276)
>     at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:58)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1648)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1395)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1965)
> {code}





[jira] [Created] (HBASE-25023) NPE while shutdown master node

2020-09-14 Thread lujie (Jira)
lujie created HBASE-25023:
-

 Summary: NPE while shutdown master node
 Key: HBASE-25023
 URL: https://issues.apache.org/jira/browse/HBASE-25023
 Project: HBase
  Issue Type: Bug
Reporter: lujie


While shutting down the master node, we see this exception:
{code:java}
2020-09-14 06:48:29,530 ERROR [PEWorker-16] procedure2.ProcedureExecutor: CODE-BUG: Uncaught runtime exception: pid=111, ppid=64, state=RUNNABLE, locked=true; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure
java.lang.NullPointerException
    at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:276)
    at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:58)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1648)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1395)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1965)
{code}





[jira] [Created] (HBASE-24976) REST Server failes to start without any error message

2020-09-02 Thread lujie (Jira)
lujie created HBASE-24976:
-

 Summary: REST Server failes to start without any error message
 Key: HBASE-24976
 URL: https://issues.apache.org/jira/browse/HBASE-24976
 Project: HBase
  Issue Type: Bug
  Components: REST
Affects Versions: 2.2.1
Reporter: lujie








[jira] [Created] (HBASE-22050) NPE happens while RS shutdown, due to atomic violation

2019-03-13 Thread lujie (JIRA)
lujie created HBASE-22050:
-

 Summary: NPE happens while RS shutdown, due to atomic violation
 Key: HBASE-22050
 URL: https://issues.apache.org/jira/browse/HBASE-22050
 Project: HBase
  Issue Type: Bug
Reporter: lujie


While the RS is shutting down, RS#abort is called due to:
{code:java}
handler.AssignRegionHandler: Fatal error occured while opening region 
hbase:meta,,1.1588230740, aborting...
{code}
And in abort:
{code:java}
2428 if (rssStub != null && this.serverName != null) {
2429   ReportRSFatalErrorRequest.Builder builder =
2430     ReportRSFatalErrorRequest.newBuilder();
2431   builder.setServer(ProtobufUtil.toServerName(this.serverName));
2432   builder.setErrorMessage(msg);
2433   rssStub.reportRSFatalError(null, builder.build());
2434 }
{code}
Lines 2428-2434 are assumed to be atomic, but while one thread is executing lines 2429-2433, 
RS#run may concurrently execute:
{code:java}
// Make sure the proxy is down.
if (this.rssStub != null) {
  this.rssStub = null;
}
{code}
So rssStub becomes null after the check, and the NPE happens:
{code:java}
2019-03-14 04:49:53,016 WARN [RS_CLOSE_META-regionserver/hadoop12:16020-0] regionserver.HRegionServer: Unable to report fatal error to master
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2433)
    at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.handleException(AssignRegionHandler.java:154)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:106)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
I think we should avoid the NPE.
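
One common way to avoid this kind of check-then-act race is to read the shared field once into a local variable and use only the local afterwards. The sketch below is an assumption about a possible approach, not the actual HBase fix; the Stub interface and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the check-then-act race on a shared stub reference. */
final class StubSnapshotSketch {
    /** Hypothetical stand-in for the RPC stub used to report errors to the master. */
    interface Stub {
        void reportFatalError(String msg);
    }

    // Written by the shutdown path and read by the abort path, on different threads.
    private volatile Stub rssStub;

    void setStub(Stub stub) {
        this.rssStub = stub;
    }

    /** Models the "make sure the proxy is down" step in the run loop. */
    void clearStub() {
        this.rssStub = null;
    }

    /** Racy: the field is read twice, and the second read may observe null. */
    boolean reportRacy(String msg) {
        if (rssStub != null) {
            rssStub.reportFatalError(msg);
            return true;
        }
        return false;
    }

    /** Reads the field once; the local reference cannot become null afterwards. */
    boolean reportWithSnapshot(String msg) {
        Stub stub = rssStub;
        if (stub == null) {
            return false;
        }
        stub.reportFatalError(msg);
        return true;
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        StubSnapshotSketch rs = new StubSnapshotSketch();
        rs.setStub(sent::add);
        System.out.println(rs.reportWithSnapshot("fatal error"));
        rs.clearStub();
        System.out.println(rs.reportWithSnapshot("after shutdown"));
        System.out.println(sent);
    }
}
```

With the local copy, a concurrent clearStub() can no longer cause an NPE, although the call may still go to a stub that is being shut down, so the caller has to tolerate a failed report.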





[jira] [Created] (HBASE-22041) Master stuck in startup and print "FailedServerException" forever

2019-03-12 Thread lujie (JIRA)
lujie created HBASE-22041:
-

 Summary: Master stuck in startup and print "FailedServerException" 
forever
 Key: HBASE-22041
 URL: https://issues.apache.org/jira/browse/HBASE-22041
 Project: HBase
  Issue Type: Bug
Reporter: lujie
 Attachments: fixedlogs.zip

During a fresh boot of the master, we shut down the RS that holds meta. The 
master startup then fails and prints thousands of log lines like:
{code:java}
2019-03-13 01:09:54,896 WARN [RSProcedureDispatcher-pool4-t1] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to java.net.ConnectException: Call to hadoop14/172.16.1.131:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: hadoop14/172.16.1.131:16020, try=0, retrying...
2019-03-13 01:09:55,004 WARN [RSProcedureDispatcher-pool4-t2] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=1, retrying...
2019-03-13 01:09:55,114 WARN [RSProcedureDispatcher-pool4-t3] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=2, retrying...
2019-03-13 01:09:55,219 WARN [RSProcedureDispatcher-pool4-t4] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=3, retrying...
2019-03-13 01:09:55,324 WARN [RSProcedureDispatcher-pool4-t5] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=4, retrying...
2019-03-13 01:09:55,428 WARN [RSProcedureDispatcher-pool4-t6] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=5, retrying...
2019-03-13 01:09:55,533 WARN [RSProcedureDispatcher-pool4-t7] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=6, retrying...
2019-03-13 01:09:55,638 WARN [RSProcedureDispatcher-pool4-t8] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=7, retrying...
2019-03-13 01:09:55,755 WARN [RSProcedureDispatcher-pool4-t9] procedure.RSProcedureDispatcher: request to server hadoop14,16020,1552410583724 failed due to org.apache.hadoop.hbase.ipc.FailedServerException: Call to hadoop14/172.16.1.131:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: hadoop14/172.16.1.131:16020, try=8, retrying...
{code}





[jira] [Created] (HBASE-22023) similar to HBASE-21740: NPE happens while shutdown the RS

2019-03-10 Thread lujie (JIRA)
lujie created HBASE-22023:
-

 Summary: similar to HBASE-21740: NPE happens while shutdown the RS
 Key: HBASE-22023
 URL: https://issues.apache.org/jira/browse/HBASE-22023
 Project: HBase
  Issue Type: Bug
Reporter: lujie
Assignee: lujie


The shutdown command arrives before startServices runs, so this check fails:
{code:java}
if (!isStopped() && !isAborted()) {
  initializeThreads();
}{code}
As a result, initializeThreads is skipped and leases stays null.

leases is then used at line 1996 without a null check, hence the NPE.

I will provide a simple fix.
{code:java}
2019-03-10 14:17:12,690 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: Failed init
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1996)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1575)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:976)
    at java.lang.Thread.run(Thread.java:745)
2019-03-10 14:17:12,719 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: * ABORTING region server hadoop15,16020,1552198622594: Unhandled: Region server startup failed *
java.io.IOException: Region server startup failed
    at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:3398)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1594)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:976)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1996)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1575)
    ... 2 more
{code}
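
The sequence described in this report (initialization skipped because a stop was requested, then a later step using the uninitialized field) can be sketched as follows. All names are hypothetical and the StringBuilder merely stands in for the leases object; this is an illustration of the guard, not the HBase patch.

```java
/** Hypothetical sketch: a field that stays null when initialization is skipped. */
final class StartupGuardSketch {
    private volatile boolean stopped;
    private StringBuilder leases; // stand-in for the real leases object

    void stop() {
        stopped = true;
    }

    /** Mirrors the reported check: initialization only happens if not stopped. */
    void initializeIfRunning() {
        if (!stopped) {
            leases = new StringBuilder("leases");
        }
    }

    /** Unguarded use: throws NullPointerException if initialization was skipped. */
    void startServicesUnguarded() {
        leases.append(":started");
    }

    /** Guarded use: returns false instead of failing when leases was never created. */
    boolean startServicesGuarded() {
        if (leases == null) {
            return false;
        }
        leases.append(":started");
        return true;
    }

    public static void main(String[] args) {
        StartupGuardSketch rs = new StartupGuardSketch();
        rs.stop();                // shutdown arrives first
        rs.initializeIfRunning(); // skipped, leases stays null
        System.out.println(rs.startServicesGuarded());
    }
}
```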





[jira] [Created] (HBASE-22017) Failed to become active master due to lease 'XXX' does not exist

2019-03-08 Thread lujie (JIRA)
lujie created HBASE-22017:
-

 Summary: Failed to become active master due to lease 'XXX' does 
not exist
 Key: HBASE-22017
 URL: https://issues.apache.org/jira/browse/HBASE-22017
 Project: HBase
  Issue Type: Bug
Reporter: lujie


{code:java}
2019-03-06 01:36:17,040 ERROR [master/hadoop11:16000:becomeActiveMaster] master.HMaster: * ABORTING master hadoop11,16000,1551807353275: Unhandled exception. Starting shutdown. *
org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '3449673378019934209' does not exist
    at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:224)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3434)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
    at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:344)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
    at org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}





[jira] [Created] (HBASE-21740) NPE happens while shutdown the RS

2019-01-18 Thread lujie (JIRA)
lujie created HBASE-21740:
-

 Summary: NPE happens while shutdown the RS
 Key: HBASE-21740
 URL: https://issues.apache.org/jira/browse/HBASE-21740
 Project: HBase
  Issue Type: Bug
Reporter: lujie


While shutting down a NM, we hit this NPE:
{code:java}
2019-01-18 16:52:05,500 INFO [Thread-4] regionserver.HRegionServer: STOPPED: Shutdown hook
2019-01-18 16:52:05,896 INFO [regionserver/hadoop15:16020] regionserver.MetricsRegionServerWrapperImpl: Computing regionserver metrics every 5000 milliseconds
2019-01-18 16:52:05,978 INFO [regionserver/hadoop15:16020.Chore.1] hbase.ScheduledChore: Chore: CompactedHFilesCleaner was stopped
2019-01-18 16:52:05,996 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: Failed init
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
2019-01-18 16:52:06,011 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: * ABORTING region server hadoop15,16020,1547801516426: Unhandled: Region server startup failed *
java.io.IOException: Region server startup failed
    at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:3392)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1591)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    ... 2 more
{code}





[jira] [Created] (HBASE-20420) Fix Some Potential NPE

2018-04-15 Thread lujie (JIRA)
lujie created HBASE-20420:
-

 Summary: Fix Some Potential NPE 
 Key: HBASE-20420
 URL: https://issues.apache.org/jira/browse/HBASE-20420
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0-beta-2
Reporter: lujie
 Attachments: hbase-20420.patch

Using the tool [NPEDetector|https://github.com/lujiefsi/NPEDetector], we 
found another six problems similar to HBASE-20419.

They are listed here, and the patch is attached.

CommonFSUtils#listStatus

RSGroupInfoManagerImpl#getRSGroupOfServer

BackupSystemTable#readBackupInfo

SnapshotManifest#getRegionManifestsMap

HRegionFileSystem#getFamilies

Result#getFamilyMap





[jira] [Created] (HBASE-20419) Two Potential NPE

2018-04-15 Thread lujie (JIRA)
lujie created HBASE-20419:
-

 Summary: Two Potential NPE 
 Key: HBASE-20419
 URL: https://issues.apache.org/jira/browse/HBASE-20419
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0-beta-2
Reporter: lujie
 Attachments: HBASE-20419_1.patch

The callee ZKUtil#listChildrenAndWatchForNewChildren may return null. It has 8 
callers, 6 of which have a null check like:
{code:java}
List<String> children =
    ZKUtil.listChildrenAndWatchForNewChildren(zkw, zkw.znodePaths.rsZNode);
if (children == null) {
  return Collections.emptyList();
}
{code}
but the other two callers do not have a null check: 
RSGroupInfoManagerImpl#retrieveGroupListFromZookeeper and 
ZKProcedureMemberRpcs#watchForAbortedProcedures.
 

We attach a patch to fix this problem. (We found this bug with the tool 
[NPEDetector|https://github.com/lujiefsi/NPEDetector].)
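
For illustration only, here is a minimal self-contained sketch of the guard that the two unguarded callers lack. It is not HBase code and not the attached patch: the class NullGuardSketch and the methods listChildrenOrNull and childrenOrEmpty are hypothetical stand-ins, and the stand-in callee is simply assumed to return null when the node is missing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch of the null-guard pattern; hypothetical names, not HBase code. */
class NullGuardSketch {

  /**
   * Hypothetical stand-in for a callee such as
   * ZKUtil#listChildrenAndWatchForNewChildren. Assumption: it returns
   * null (not an empty list) when the node does not exist.
   */
  static List<String> listChildrenOrNull(boolean nodeExists) {
    return nodeExists ? Arrays.asList("rs1", "rs2") : null;
  }

  /** Guarded caller: converts a null result into an empty list. */
  static List<String> childrenOrEmpty(boolean nodeExists) {
    List<String> children = listChildrenOrNull(nodeExists);
    if (children == null) {
      return Collections.emptyList();
    }
    return children;
  }

  public static void main(String[] args) {
    // Iterating the guarded result is safe even when the node is missing;
    // iterating the raw result of listChildrenOrNull(false) would throw an NPE.
    for (String child : childrenOrEmpty(false)) {
      System.out.println(child);
    }
    System.out.println(childrenOrEmpty(true).size());
  }
}
```

Returning an empty list keeps the caller's loop unchanged. Whether an empty list or an early return is the right reaction depends on the individual caller.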





[jira] [Created] (HBASE-19004) master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected

2017-10-13 Thread lujie (JIRA)
lujie created HBASE-19004:
-

 Summary: master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
 Key: HBASE-19004
 URL: https://issues.apache.org/jira/browse/HBASE-19004
 Project: HBase
  Issue Type: Bug
Reporter: lujie


When the stop regionserver command is sent, the master logs the following:

{code:java}
2017-10-13 16:28:28,366 INFO  [ProcedureExecutor-1] 
zookeeper.ZKTableStateManager: Moving table TestTable state from null to 
ENABLING
2017-10-13 16:28:28,387 INFO  [ProcedureExecutor-1] master.AssignmentManager: 
Bulk assigning 1 region(s) across 3 server(s), round-robin=true
2017-10-13 16:28:28,388 INFO  
[hadoop11,16000,1507883241250-GeneralBulkAssigner-0] master.AssignmentManager: 
Assigning 1 region(s) to hadoop11,16020,1507883241942
2017-10-13 16:28:28,394 INFO  
[hadoop11,16000,1507883241250-GeneralBulkAssigner-0] master.RegionStates: 
Transition {2aaaf8304f2b09288f528ac0f105cc01 state=OFFLINE, ts=1507883308388, 
server=null} to {2aaaf8304f2b09288f528ac0f105cc01 state=PENDING_OPEN, 
ts=1507883308394, server=hadoop11,16020,1507883241942}
2017-10-13 16:28:28,585 INFO  [AM.ZK.Worker-pool2-t10] master.RegionStates: 
Transition {2aaaf8304f2b09288f528ac0f105cc01 state=PENDING_OPEN, 
ts=1507883308394, server=hadoop11,16020,1507883241942} to 
{2aaaf8304f2b09288f528ac0f105cc01 state=OPENING, ts=1507883308585, 
server=hadoop11,16020,1507883241942}
2017-10-13 16:28:29,163 INFO  [AM.ZK.Worker-pool2-t11] master.RegionStates: 
Transition {2aaaf8304f2b09288f528ac0f105cc01 state=OPENING, ts=1507883308585, 
server=hadoop11,16020,1507883241942} to {2aaaf8304f2b09288f528ac0f105cc01 
state=OPEN, ts=1507883309163, server=hadoop11,16020,1507883241942}
2017-10-13 16:28:36,517 INFO  [main-EventThread] zookeeper.RegionServerTracker: 
RegionServer ephemeral node deleted, processing expiration 
[hadoop11,16020,1507883241942]
2017-10-13 16:28:37,428 INFO  [ProcedureExecutor-2] 
procedure.ServerCrashProcedure: Start processing crashed 
hadoop11,16020,1507883241942
2017-10-13 16:28:37,689 INFO  [ProcedureExecutor-4] master.SplitLogManager: 
dead splitlog workers [hadoop11,16020,1507883241942]
2017-10-13 16:28:37,693 INFO  [ProcedureExecutor-4] master.SplitLogManager: 
hdfs://hadoop11:29000/hbase/WALs/hadoop11,16020,1507883241942-splitting is 
empty dir, no logs to split
2017-10-13 16:28:37,695 INFO  [ProcedureExecutor-4] master.SplitLogManager: 
Started splitting 0 logs in 
[hdfs://hadoop11:29000/hbase/WALs/hadoop11,16020,1507883241942-splitting] for 
[hadoop11,16020,1507883241942]
2017-10-13 16:28:37,701 INFO  [ProcedureExecutor-4] master.SplitLogManager: 
finished splitting (more than or equal to) 0 bytes in 0 log files in 
[hdfs://hadoop11:29000/hbase/WALs/hadoop11,16020,1507883241942-splitting] in 6ms
2017-10-13 16:28:37,807 WARN  [ProcedureExecutor-4] master.RegionStates: THIS 
SHOULD NOT HAPPEN: unexpected {2aaaf8304f2b09288f528ac0f105cc01 state=OPEN, 
ts=1507883309163, server=hadoop11,16020,1507883241942}
2017-10-13 16:28:37,923 INFO  [ProcedureExecutor-4] 
procedure.ServerCrashProcedure: Finished processing of crashed 
hadoop11,16020,1507883241942
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)