[ https://issues.apache.org/jira/browse/HDFS-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Abhiraj Butala updated HDFS-6378:
---------------------------------
    Attachment: HDFS-6378.patch

Attaching a simple patch to add a timeout to DatagramSocket, which otherwise blocks indefinitely on receive(). I have set the timeout to 500 ms; let me know if it should be changed to something more appropriate.

Ctrl-C is now able to kill the NFS gateway if portmap is not running or has exited. Note that an exception is logged when portmap is not running, but the NFS gateway does not exit until Ctrl-C is pressed.

Output logs:
{code}
14/06/29 03:11:46 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 INFO oncrpc.SimpleTcpServer: Started listening to TCP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 ERROR oncrpc.RpcProgram: Registration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
    at java.net.PlainDatagramSocketImpl.receive0(Native Method)
    at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
    at java.net.DatagramSocket.receive(DatagramSocket.java:786)
    at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
    at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Exception in thread "main" java.lang.RuntimeException: Registration failure
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
    at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
    at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Caused by: java.net.SocketTimeoutException: Receive timed out
    at java.net.PlainDatagramSocketImpl.receive0(Native Method)
    at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
    at java.net.DatagramSocket.receive(DatagramSocket.java:786)
    at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
    ... 5 more
^C14/06/29 03:18:51 ERROR nfs3.Nfs3Base: RECEIVED SIGNAL 2: SIGINT
14/06/29 03:18:52 ERROR oncrpc.RpcProgram: Unregistration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
    at java.net.PlainDatagramSocketImpl.receive0(Native Method)
    at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
    at java.net.DatagramSocket.receive(DatagramSocket.java:786)
    at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
    at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
    at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
14/06/29 03:18:52 WARN util.ShutdownHookManager: ShutdownHook 'Unregister' failed, java.lang.RuntimeException: Unregistration failure
java.lang.RuntimeException: Unregistration failure
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
    at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
    at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.SocketTimeoutException: Receive timed out
    at java.net.PlainDatagramSocketImpl.receive0(Native Method)
    at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
    at java.net.DatagramSocket.receive(DatagramSocket.java:786)
    at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
    at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
    ... 3 more
14/06/29 03:18:52 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at abutala-vBox/127.0.1.1
************************************************************/
{code}

> NFS: when portmap/rpcbind is not available, NFS registration should timeout instead of hanging
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6378
>                 URL: https://issues.apache.org/jira/browse/HDFS-6378
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>            Reporter: Brandon Li
>         Attachments: HDFS-6378.patch
>
>
> When portmap/rpcbind is not available, NFS could be stuck at registration.
> Instead, NFS gateway should shut down automatically with proper error message.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
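
For readers without the attachment at hand, the timeout behaviour described in the comment above can be sketched roughly as follows. This is only an illustration, not the contents of HDFS-6378.patch: the class name UdpTimeoutSketch, the method sendAndReceive, and the choice to return -1 on timeout are assumptions made for the example. The only API relied upon is the standard java.net.DatagramSocket.setSoTimeout(int), which makes receive() throw SocketTimeoutException instead of blocking indefinitely.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical illustration class; not part of Hadoop or of HDFS-6378.patch.
class UdpTimeoutSketch {

    /**
     * Sends one datagram and waits at most timeoutMs for a reply.
     * Returns the reply length, or -1 if nothing arrived in time.
     */
    static int sendAndReceive(InetAddress host, int port, byte[] request, int timeoutMs)
            throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            // Without this call, receive() below blocks until a packet arrives.
            socket.setSoTimeout(timeoutMs);
            socket.send(new DatagramPacket(request, request.length, host, port));

            byte[] buffer = new byte[65535];
            DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
            try {
                socket.receive(reply);
                return reply.getLength();
            } catch (SocketTimeoutException e) {
                // The peer (for example portmap) did not answer within timeoutMs.
                return -1;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A bound socket that never replies stands in for an unresponsive peer.
        try (DatagramSocket silentPeer = new DatagramSocket(0, loopback)) {
            int n = sendAndReceive(loopback, silentPeer.getLocalPort(), new byte[] {0}, 500);
            System.out.println(n < 0 ? "receive timed out" : "got " + n + " bytes");
        }
    }
}
```

In the logs above, the timeout surfaces as an exception that RpcProgram wraps in a RuntimeException; the sketch returns -1 instead only to keep the example short. Whether a caller should retry, log, or exit on timeout is a separate decision that the sketch does not address.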