[ https://issues.apache.org/jira/browse/HDFS-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhiraj Butala updated HDFS-6378:
---------------------------------

    Attachment: HDFS-6378.patch

Attaching a simple patch that sets a timeout on the DatagramSocket, which 
otherwise blocks indefinitely in receive(). I have set the timeout to 500 ms; 
let me know if a different value would be more appropriate. 
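
For anyone who wants to see the mechanism in isolation, here is a minimal 
standalone sketch of a UDP receive bounded by setSoTimeout(). It is not the 
code from the attached patch: the class name UdpTimeoutSketch, the method 
sendAndAwaitReply and the loopback "silent peer" are hypothetical, chosen only 
to mimic a portmap that never answers.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not taken from HDFS-6378.patch.
class UdpTimeoutSketch {

    /**
     * Sends one datagram and waits at most timeoutMs for a reply.
     * Returns true if a reply arrived, false if the receive timed out.
     */
    static boolean sendAndAwaitReply(InetAddress host, int port,
                                     byte[] request, int timeoutMs)
            throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            // Without this call, receive() below blocks until a packet arrives.
            socket.setSoTimeout(timeoutMs);
            socket.send(new DatagramPacket(request, request.length, host, port));

            byte[] buf = new byte[65535];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(reply);
                return true;
            } catch (SocketTimeoutException e) {
                // The peer stayed silent for timeoutMs milliseconds.
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A bound socket that never replies stands in for an unresponsive portmap.
        try (DatagramSocket silentPeer = new DatagramSocket(0, loopback)) {
            boolean replied = sendAndAwaitReply(
                    loopback, silentPeer.getLocalPort(), new byte[] {1}, 500);
            System.out.println(replied ? "reply received" : "receive timed out");
        }
    }
}
```

The sketch swallows the SocketTimeoutException and returns false so the caller 
can decide what to do next, whereas in the logs below the exception propagates 
and surfaces as a registration failure.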

With the patch, Ctrl-C can kill the NFS gateway when portmap is not running or 
has exited. Note that an exception is logged when portmap is not running, but 
the NFS gateway does not exit until Ctrl-C is pressed.

Output logs:
{code}
14/06/29 03:11:46 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 INFO oncrpc.SimpleTcpServer: Started listening to TCP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 ERROR oncrpc.RpcProgram: Registration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
        at java.net.DatagramSocket.receive(DatagramSocket.java:786)
        at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
        at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Exception in thread "main" java.lang.RuntimeException: Registration failure
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
        at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Caused by: java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
        at java.net.DatagramSocket.receive(DatagramSocket.java:786)
        at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
        ... 5 more
^C14/06/29 03:18:51 ERROR nfs3.Nfs3Base: RECEIVED SIGNAL 2: SIGINT
14/06/29 03:18:52 ERROR oncrpc.RpcProgram: Unregistration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
        at java.net.DatagramSocket.receive(DatagramSocket.java:786)
        at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
        at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
        at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
14/06/29 03:18:52 WARN util.ShutdownHookManager: ShutdownHook 'Unregister' failed, java.lang.RuntimeException: Unregistration failure
java.lang.RuntimeException: Unregistration failure
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
        at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
        at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
        at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
        at java.net.DatagramSocket.receive(DatagramSocket.java:786)
        at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
        at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
        ... 3 more
14/06/29 03:18:52 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at abutala-vBox/127.0.1.1
************************************************************/
{code}

> NFS: when portmap/rpcbind is not available, NFS registration should timeout 
> instead of hanging 
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6378
>                 URL: https://issues.apache.org/jira/browse/HDFS-6378
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>            Reporter: Brandon Li
>         Attachments: HDFS-6378.patch
>
>
> When portmap/rpcbind is not available, NFS could be stuck at registration. 
> Instead, NFS gateway should shut down automatically with proper error message.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
