[ https://issues.apache.org/jira/browse/HDFS-13309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Elek, Marton updated HDFS-13309:
--------------------------------
    Status: Patch Available  (was: Open)

> Ozone: Improve error message in case of missing nodes
> -----------------------------------------------------
>
>                 Key: HDFS-13309
>                 URL: https://issues.apache.org/jira/browse/HDFS-13309
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: HDFS-7240
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Minor
>         Attachments: HDFS-13309-HDFS-7240.001.patch, HDFS-13309-HDFS-7240.002.patch
>
>
> While testing ozonefs with Spark, I found multiple error messages in the log:
> {code}
> scm_1 | java.lang.NullPointerException
> scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerStates.ContainerStateMap.addContainer(ContainerStateMap.java:129)
> scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerStateManager.allocateContainer(ContainerStateManager.java:308)
> scm_1 |     at org.apache.hadoop.ozone.scm.container.ContainerMapping.allocateContainer(ContainerMapping.java:244)
> scm_1 |     at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.preAllocateContainers(BlockManagerImpl.java:189)
> scm_1 |     at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:291)
> scm_1 |     at org.apache.hadoop.ozone.scm.StorageContainerManager.allocateBlock(StorageContainerManager.java:1131)
> scm_1 |     at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:109)
> scm_1 |     at org.apache.hadoop.hdsl.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:8038)
> scm_1 |     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> scm_1 |     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
> scm_1 |     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
> scm_1 |     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
> scm_1 |     at java.security.AccessController.doPrivileged(Native Method)
> scm_1 |     at javax.security.auth.Subject.doAs(Subject.java:422)
> scm_1 |     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> scm_1 |     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code}
> The problem is that PipelineManager.getPipeline() may return null if the
> pipeline cannot be found or established (for example, if there are not
> enough nodes for a Ratis ring).
> In ContainerStateMap.addContainer, this pipeline is expected to be non-null.
> I suggest adding a null check in ContainerStateManager.allocateContainer
> and returning a more meaningful error message.
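
For illustration, the suggested check could look roughly like the sketch below. This is not the attached patch: the Pipeline and PipelineLookup types, the method signature, and the wording of the error message are all hypothetical stand-ins, assumed only to show how a null pipeline might be turned into a descriptive IOException before it reaches ContainerStateMap.addContainer.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these are NOT the real Ozone/HDSL classes or signatures.
final class PipelineCheckSketch {

  /** Placeholder for a pipeline: a named group of datanodes. */
  static final class Pipeline {
    final String name;
    final List<String> nodes;

    Pipeline(String name, List<String> nodes) {
      this.name = name;
      this.nodes = nodes;
    }
  }

  /** Placeholder lookup; assumed to return null when no pipeline can be formed. */
  interface PipelineLookup {
    Pipeline getPipeline(String replicationType, int replicationFactor);
  }

  /**
   * Fails early with a descriptive message instead of passing a null
   * pipeline further down, where it would surface as a NullPointerException.
   */
  static Pipeline allocateContainer(PipelineLookup lookup, String replicationType,
      int replicationFactor, String containerName) throws IOException {
    Pipeline pipeline = lookup.getPipeline(replicationType, replicationFactor);
    if (pipeline == null) {
      throw new IOException("Cannot allocate container " + containerName
          + ": no pipeline available for replication type " + replicationType
          + " with factor " + replicationFactor
          + ". Check that enough datanodes are registered.");
    }
    return pipeline;
  }

  public static void main(String[] args) {
    // A lookup that always succeeds, then one that simulates missing nodes.
    PipelineLookup healthy =
        (type, factor) -> new Pipeline("pipeline-1", Arrays.asList("dn1", "dn2", "dn3"));
    PipelineLookup missingNodes = (type, factor) -> null;

    try {
      System.out.println(allocateContainer(healthy, "RATIS", 3, "c1").name);
      allocateContainer(missingNodes, "RATIS", 3, "c2");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a check of this kind, a client asking for a Ratis container on an undersized cluster would see the cause stated directly, rather than a NullPointerException in the SCM log.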