[ 
https://issues.apache.org/jira/browse/HDFS-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923007#comment-16923007
 ] 

Ayush Saxena commented on HDFS-14811:
-------------------------------------

Not the correct solution specifically, but the correct work around. Solution 
shall be if we get to know why this is happening in the test and eliminate the 
root cause for it.

> RBF: TestRouterRpc#testErasureCoding is flaky
> ---------------------------------------------
>
>                 Key: HDFS-14811
>                 URL: https://issues.apache.org/jira/browse/HDFS-14811
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>            Priority: Major
>         Attachments: HDFS-14811.001.patch
>
>
> The Failed reason:
> {code:java}
> 2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
> blockmanagement.BlockPlacementPolicy 
> (BlockPlacementPolicyDefault.java:chooseRandom(838)) - [
> Node /default-rack/127.0.0.1:53148 [
> ]
> Node /default-rack/127.0.0.1:53161 [
> ]
> Node /default-rack/127.0.0.1:53157 [
>   Datanode 127.0.0.1:53157 is not chosen since the node is too busy (load: 3 
> > 2.6666666666666665).
> Node /default-rack/127.0.0.1:53143 [
> ]
> Node /default-rack/127.0.0.1:53165 [
> ]
> 2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
> blockmanagement.BlockPlacementPolicy 
> (BlockPlacementPolicyDefault.java:chooseRandom(846)) - Not enough replicas 
> was chosen. Reason: {NODE_TOO_BUSY=1}
> 2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
> blockmanagement.BlockPlacementPolicy 
> (BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
> replicas, still in need of 1 to reach 6 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) 
> 2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
> protocol.BlockStoragePolicy (BlockStoragePolicy.java:chooseStorageTypes(161)) 
> - Failed to place enough replicas: expected size is 1 but only 0 storage 
> types can be selected (replication=6, selected=[], unavailable=[DISK], 
> removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
> 2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
> blockmanagement.BlockPlacementPolicy 
> (BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
> replicas, still in need of 1 to reach 6 (unavailableStorages=[DISK], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All 
> required storage types are unavailable:  unavailableStorages=[DISK], 
> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> 2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] INFO  
> ipc.Server (Server.java:logException(2982)) - IPC Server handler 5 on default 
> port 53140, call Call#1270 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:53202
> java.io.IOException: File /testec/testfile2 could only be written to 5 of the 
> 6 required nodes for RS-6-3-1024k. There are 6 datanode(s) running and 6 
> node(s) are excluded in this operation.
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2222)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2815)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:893)
>       at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2921)
> 2019-09-01 18:19:20,942 [IPC Server handler 6 on default port 53197] INFO  
> ipc.Server (Server.java:logException(2975)) - IPC Server handler 6 on default 
> port 53197, call Call#1268 Retry#0 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 192.168.1.112:53201: java.io.IOException: File /testec/testfile2 could only 
> be written to 5 of the 6 required nodes for RS-6-3-1024k. There are 6 
> datanode(s) running and 6 node(s) are excluded in this operation.
> {code}
> More discussion, see: 
> [HDFS-14654|https://issues.apache.org/jira/browse/HDFS-14654?focusedCommentId=16920439&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16920439]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to