Elek, Marton created HDFS-13309:
-----------------------------------

             Summary: Ozone: Improve error message in case of missing nodes
                 Key: HDFS-13309
                 URL: https://issues.apache.org/jira/browse/HDFS-13309
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: HDFS-7240
    Affects Versions: HDFS-7240
            Reporter: Elek, Marton
            Assignee: Elek, Marton

While testing ozonefs with Spark I found multiple error messages in the log:

{code}
scm_1 | java.lang.NullPointerException
scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerStates.ContainerStateMap.addContainer(ContainerStateMap.java:129)
scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerStateManager.allocateContainer(ContainerStateManager.java:308)
scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerMapping.allocateContainer(ContainerMapping.java:244)
scm_1 |     at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.preAllocateContainers(BlockManagerImpl.java:189)
scm_1 |     at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:291)
scm_1 |     at org.apache.hadoop.ozone.scm.StorageContainerManager.allocateBlock(StorageContainerManager.java:1131)
scm_1 |     at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:109)
scm_1 |     at org.apache.hadoop.hdsl.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:8038)
scm_1 |     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
scm_1 |     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
scm_1 |     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
scm_1 |     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
scm_1 |     at java.security.AccessController.doPrivileged(Native Method)
scm_1 |     at javax.security.auth.Subject.doAs(Subject.java:422)
scm_1 |     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
scm_1 |     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
{code}

The problem is that PipelineManager.getPipeline() may return null if a pipeline cannot be found or established (for example, if there are not enough nodes for a Ratis ring).
ContainerStateMap.addContainer expects this pipeline to be non-null. I suggest adding a check in ContainerStateManager.allocateContainer and returning a more meaningful error message.
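
The suggested check could look roughly like the sketch below. This is only an illustration under assumptions, not the actual Ozone code: the Pipeline and PipelineManager classes here are simplified hypothetical stand-ins, and the node-count parameter, the container name argument, and the wording of the error message are all made up for the example. The point is only that the null result is caught at allocation time and turned into an exception that names the likely cause, instead of surfacing later as a NullPointerException.

```java
import java.io.IOException;

// Sketch only: every type here is a hypothetical stand-in, not the Ozone class
// of the same name.
final class AllocateContainerSketch {

  /** Hypothetical stand-in for a pipeline of datanodes. */
  static final class Pipeline {
    final String name;

    Pipeline(String name) {
      this.name = name;
    }
  }

  /** Hypothetical stand-in that returns null when too few nodes are available. */
  static final class PipelineManager {
    private final int availableNodes;

    PipelineManager(int availableNodes) {
      this.availableNodes = availableNodes;
    }

    Pipeline getPipeline(int requiredNodes) {
      return availableNodes >= requiredNodes
          ? new Pipeline("pipeline-" + requiredNodes)
          : null;
    }
  }

  /**
   * Checks the pipeline before it is handed on, so a missing pipeline is
   * reported with a descriptive message rather than a NullPointerException.
   */
  static Pipeline allocateContainer(PipelineManager manager, int requiredNodes,
      String containerName) throws IOException {
    Pipeline pipeline = manager.getPipeline(requiredNodes);
    if (pipeline == null) {
      throw new IOException("Unable to allocate container " + containerName
          + ": no pipeline could be found or established. Are there at least "
          + requiredNodes + " healthy nodes available?");
    }
    return pipeline;
  }

  public static void main(String[] args) {
    // Only one node is available, but three are requested.
    PipelineManager manager = new PipelineManager(1);
    try {
      allocateContainer(manager, 3, "container-1");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In the real code the exception type and message would follow whatever convention SCM already uses for allocation failures; IOException is used here only to keep the sketch self-contained.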