[ 
https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002571#comment-17002571
 ] 

Fei Hui commented on HDFS-15078:
--------------------------------

[~ayushtkn] [~elgoiri] I have found a critical problem in RBF. This issue 
resolves it in some scenarios, but I have no idea about the overall 
resolution. I plan to file a new Jira to track it.
The problem is as follows:
# A client configured with two routers (r0, r1) creates an HDFS file via r0, 
gets an exception, and fails over to r1
# r0 has already sent the create RPC to the namenode (1st create)
# The client creates the HDFS file via r1 (2nd create)
# The client writes the HDFS file and finally closes it (3rd close)

The namenode may receive the RPCs in the following order:
# 2nd create
# 3rd close
# 1st create

Because overwrite is true by default, the delayed 1st create overwrites the 
file that was already written and closed, leaving an empty file. This is a 
critical problem, and we have encountered it.
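
The ordering above can be replayed with a toy model. This is only a sketch: CreateRaceDemo, its in-memory map, and the path /tmp/demo are hypothetical stand-ins and not Hadoop code. The only behaviour assumed is that create with overwrite=true replaces an existing file with an empty one.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a namenode namespace; hypothetical, not Hadoop code. */
public class CreateRaceDemo {
    private final Map<String, StringBuilder> files = new HashMap<>();

    /** With overwrite=true, an existing file is replaced by an empty one. */
    public void create(String path, boolean overwrite) {
        if (files.containsKey(path) && !overwrite) {
            throw new IllegalStateException("file exists: " + path);
        }
        files.put(path, new StringBuilder());
    }

    /** Appends data to a file that was created earlier. */
    public void write(String path, String data) {
        files.get(path).append(data);
    }

    /** Returns the current content of the file. */
    public String read(String path) {
        return files.get(path).toString();
    }

    /** Replays the arrival order from the comment and returns the final content. */
    public static String replayDelayedCreate() {
        CreateRaceDemo nn = new CreateRaceDemo();
        String path = "/tmp/demo";      // hypothetical path
        nn.create(path, true);          // 2nd create (via r1) arrives first
        nn.write(path, "payload");      // client writes, then closes (3rd close)
        nn.create(path, true);          // delayed 1st create (via r0) arrives last
        return nn.read(path);
    }

    public static void main(String[] args) {
        // The delayed create has replaced the written file with an empty one.
        System.out.println("length after replay: " + replayDelayedCreate().length());
    }
}
```

In this model the final length is 0, which matches the empty file described above.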

> RBF: Should check connection channel before sending rpc to namenode
> -------------------------------------------------------------------
>
>                 Key: HDFS-15078
>                 URL: https://issues.apache.org/jira/browse/HDFS-15078
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>    Affects Versions: 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HDFS-15078.001.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 6400 on 8888, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 125 on 8888 caught an exception
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
>         at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
>         at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
>         at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
>         at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
>         at org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
>         at org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
>         at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> It may be better to check the connection between the client and the router 
> before sending the RPC to the namenode.
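
The check proposed in the quoted description could look roughly like the sketch below. It is not the attached patch: ChannelGuardDemo and forwardIfOpen are hypothetical names, and the Runnable stands in for whatever forwards the call to the namenode. Note that Channel.isOpen() only reports whether the channel has been closed locally, so a check like this narrows the window but cannot cover every scenario.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channel;
import java.nio.channels.SocketChannel;

/** Hypothetical guard: skip forwarding when the caller's channel is closed. */
public class ChannelGuardDemo {
    /**
     * Runs the forwarding action only if the client channel is still open.
     * Returns true when the call was forwarded, false when it was skipped.
     */
    public static boolean forwardIfOpen(Channel clientChannel, Runnable forward) {
        if (clientChannel == null || !clientChannel.isOpen()) {
            return false;   // client is gone: do not send the RPC downstream
        }
        forward.run();
        return true;
    }

    /** Forwards once on an open channel, then checks the skip after close(). */
    public static boolean demoClosedChannelIsSkipped() {
        try {
            SocketChannel ch = SocketChannel.open();      // unconnected but open
            boolean forwardedWhileOpen = forwardIfOpen(ch, () -> { });
            ch.close();
            boolean forwardedAfterClose = forwardIfOpen(ch, () -> { });
            return forwardedWhileOpen && !forwardedAfterClose;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```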



