[jira] [Updated] (HDFS-16097) Datanode receives ipc requests will throw NPE when datanode quickly restart

2021-07-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-16097:
-
Description: 
The DataNode throws an NPE if it receives IPC requests right after a quick 
restart. This is because, when the DN restarts, the BlockPool is first 
registered with blockPoolManager and fsdataset is initialized only afterwards. 
If an IPC request arrives after the BlockPool has been registered but before 
fsdataset is initialized, the DataNode throws an NPE, because the request 
handler calls methods provided by fsdataset. The stack trace is as follows:



{code:java}
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:3468)
at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:3105)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:916)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:862)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
{code}
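
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It is not the attached patch and does not use the real Hadoop classes: SketchDataNode, Dataset, initDataset and both handler methods are hypothetical stand-ins, and throwing an IOException when the dataset is still null is only one possible way to handle the window between BlockPool registration and fsdataset initialization.

```java
import java.io.IOException;

public class FsDatasetGuardSketch {

    /** Hypothetical stand-in for the dataset that the DataNode delegates to. */
    public interface Dataset {
        String initReplicaRecovery(String blockId) throws IOException;
    }

    /** Hypothetical stand-in for the DataNode; not the real Hadoop class. */
    public static class SketchDataNode {
        // Stays null between block pool registration and dataset initialization.
        private volatile Dataset data;

        /** Simulates the late dataset initialization that follows registration. */
        public void initDataset() {
            data = blockId -> "recovering " + blockId;
        }

        /** Unguarded handler: dereferences the field directly, so it can throw an NPE. */
        public String initReplicaRecoveryUnguarded(String blockId) throws IOException {
            return data.initReplicaRecovery(blockId);
        }

        /** Guarded handler: reports a checked error while the dataset is not ready. */
        public String initReplicaRecoveryGuarded(String blockId) throws IOException {
            Dataset d = data; // read the volatile field once
            if (d == null) {
                throw new IOException("Dataset not initialized yet for block " + blockId);
            }
            return d.initReplicaRecovery(blockId);
        }
    }

    public static void main(String[] args) throws IOException {
        SketchDataNode dn = new SketchDataNode();
        // A request that arrives before initDataset() hits the null field.
        try {
            dn.initReplicaRecoveryUnguarded("blk_1");
        } catch (NullPointerException e) {
            System.out.println("unguarded handler: NullPointerException");
        }
        try {
            dn.initReplicaRecoveryGuarded("blk_1");
        } catch (IOException e) {
            System.out.println("guarded handler: " + e.getMessage());
        }
        // Once the dataset exists, the guarded handler delegates normally.
        dn.initDataset();
        System.out.println(dn.initReplicaRecoveryGuarded("blk_1"));
    }
}
```

With a guard of this kind the caller receives a descriptive, checked error instead of a bare NPE and can decide to retry once the restarted DataNode has finished initializing.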


The client-side stack trace is as follows:

{code:java}
WARN org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to recover block (block=BP-###:blk_###, datanode=DatanodeInfoWithStorage[,null,null])
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:3468)
at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:3105)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:916)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:862)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2873)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1457)
at org.apache.hadoop.ipc.Client.call(Client.java:1367)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy26.initReplicaRecovery(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolTranslatorPB.initReplicaRecovery(InterDatanodeProtocolTranslatorPB.java:83)
at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker.callInitReplicaRecovery(BlockRecoveryWorker.java:571)
at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker.access$400(BlockRecoveryWorker.java:57)
at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskContiguous.recover(BlockRecoveryWorker.java:142)
at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:610)
at java.lang.Thread.run(Thread.java:748)
{code}



  was:
The DataNode throws an NPE if it receives IPC requests right after a quick 
restart. This is because, when the DN restarts, the BlockPool is first 
registered with blockPoolManager and fsdataset is initialized only afterwards. 
If an IPC request arrives after the BlockPool has been registered but before 
fsdataset is initialized, the DataNode throws an NPE, because the request 
handler calls methods provided by fsdataset. The stack trace is as follows:



{code:java}
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:3468)
at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:3105)
at 

[jira] [Updated] (HDFS-16097) Datanode receives ipc requests will throw NPE when datanode quickly restart

2021-06-30 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-16097:
---
Status: Patch Available  (was: Open)

> Datanode receives ipc requests will throw NPE when datanode quickly restart 
> 
>
> Key: HDFS-16097
> URL: https://issues.apache.org/jira/browse/HDFS-16097
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: 
> Reporter: lei w
> Assignee: lei w
> Priority: Major
> Attachments: HDFS-16097.001.patch
>
>
> The DataNode throws an NPE if it receives IPC requests right after a quick 
> restart. This is because, when the DN restarts, the BlockPool is first 
> registered with blockPoolManager and fsdataset is initialized only afterwards. 
> If an IPC request arrives after the BlockPool has been registered but before 
> fsdataset is initialized, the DataNode throws an NPE, because the request 
> handler calls methods provided by fsdataset. The stack trace is as follows:
> {code:java}
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:3468)
> at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
> at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:3105)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:916)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:862)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16097) Datanode receives ipc requests will throw NPE when datanode quickly restart

2021-06-29 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-16097:
-
Attachment: HDFS-16097.001.patch

> Datanode receives ipc requests will throw NPE when datanode quickly restart 
> 
>
> Key: HDFS-16097
> URL: https://issues.apache.org/jira/browse/HDFS-16097
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Environment: 
> Reporter: lei w
> Priority: Major
> Attachments: HDFS-16097.001.patch
>
>
> The DataNode throws an NPE if it receives IPC requests right after a quick 
> restart. This is because, when the DN restarts, the BlockPool is first 
> registered with blockPoolManager and fsdataset is initialized only afterwards. 
> If an IPC request arrives after the BlockPool has been registered but before 
> fsdataset is initialized, the DataNode throws an NPE, because the request 
> handler calls methods provided by fsdataset. The stack trace is as follows:
> {code:java}
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:3468)
> at org.apache.hadoop.hdfs.protocolPB.InterDatanodeProtocolServerSideTranslatorPB.initReplicaRecovery(InterDatanodeProtocolServerSideTranslatorPB.java:55)
> at org.apache.hadoop.hdfs.protocol.proto.InterDatanodeProtocolProtos$InterDatanodeProtocolService$2.callBlockingMethod(InterDatanodeProtocolProtos.java:3105)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:916)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:862)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> {code}


