[ https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002571#comment-17002571 ]
Fei Hui commented on HDFS-15078:
--------------------------------

[~ayushtkn] [~elgoiri] I have found a critical problem in RBF. This issue can resolve it in some scenarios, but I have no idea about the overall resolution. I plan to file a new jira to track it.

The problem is as follows:
# A client configured with RBF (r0, r1) creates an HDFS file via r0, gets an exception, and fails over to r1
# r0 has already sent the create rpc to the namenode (1st create)
# The client creates the HDFS file via r1 (2nd create)
# The client writes the HDFS file and finally closes it (3rd close)

The namenode may receive the rpcs in the following order:
# 2nd create
# 3rd close
# 1st create

Since overwrite is true by default, the late 1st create would replace the file that had just been written with an empty file. This is a critical problem and we have encountered it.

> RBF: Should check connection channel before sending rpc to namenode
> -------------------------------------------------------------------
>
>                 Key: HDFS-15078
>                 URL: https://issues.apache.org/jira/browse/HDFS-15078
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>    Affects Versions: 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HDFS-15078.001.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 6400 on 8888, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 125 on 8888 caught an exception
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
>         at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
>         at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
>         at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
>         at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
>         at org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
>         at org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
>         at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe it is better to check the connection between client and router before sending the rpc to the namenode
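
The rpc reordering described in the comment can be illustrated with a small toy model. This is only a sketch and not Hadoop code: `ToyNameNode`, `write_and_close`, and `replay` are hypothetical names invented for the illustration, the path `/f` and the payload are made up, and leases, retries, and the router itself are left out. It assumes only what the comment states, namely that create with overwrite enabled replaces an existing file with an empty one.

```python
class ToyNameNode:
    """Hypothetical in-memory stand-in for a namenode namespace (not Hadoop code)."""

    def __init__(self):
        self.files = {}  # path -> file content as bytes

    def create(self, path, overwrite=True):
        # With overwrite=True an existing file is replaced by an empty one.
        if path in self.files and not overwrite:
            raise FileExistsError(path)
        self.files[path] = b""

    def write_and_close(self, path, data):
        # Collapses the client's write and close into a single step.
        self.files[path] += data


def replay(order):
    """Apply the three rpcs in the given arrival order and return the final content."""
    nn = ToyNameNode()
    rpcs = {
        "1st create": lambda: nn.create("/f"),  # forwarded by r0, may arrive late
        "2nd create": lambda: nn.create("/f"),  # sent via r1 after the failover
        "3rd close": lambda: nn.write_and_close("/f", b"payload"),
    }
    for name in order:
        rpcs[name]()
    return nn.files["/f"]


in_order = ["1st create", "2nd create", "3rd close"]
reordered = ["2nd create", "3rd close", "1st create"]

print(replay(in_order))   # the written data is kept
print(replay(reordered))  # the late 1st create leaves an empty file
```

In this model the final content depends only on whether the stale create from r0 arrives after the close, which is the window that checking the client connection before forwarding the rpc would narrow but, as the comment notes, not fully remove.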