[ 
https://issues.apache.org/jira/browse/HADOOP-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HADOOP-8191:
-----------------------------------

     Target Version/s: 0.23.3
    Affects Version/s:     (was: 0.24.0)
                       0.23.3

Moved to Common since the patch has no HDFS changes. I've also kicked Jenkins 
to run test-patch on the patch.
                
> SshFenceByTcpPort uses netcat incorrectly
> -----------------------------------------
>
>                 Key: HADOOP-8191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8191
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 0.23.3
>            Reporter: Philip Zeyliger
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3081.txt
>
>
> SshFenceByTcpPort currently assumes that the NN is listening on localhost.  
> Typical setups have the namenode listening only on the namenode's own 
> hostname, so the "nc -z" check against localhost does not detect it.
> Here's an example in which the NN is running, listening on 8020, but doesn't 
> respond to "localhost 8020".
> {noformat}
> [root@xxx ~]# lsof -P -p 5286 | grep -i listen
> java    5286 root  110u  IPv4            1772357              TCP xxx:8020 (LISTEN)
> java    5286 root  121u  IPv4            1772397              TCP xxx:50070 (LISTEN)
> [root@xxx ~]# nc -z localhost 8020
> [root@xxx ~]# nc -z xxx 8020
> Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
> {noformat}
> Here's the likely offending code:
> {code}
>         LOG.info(
>             "Indeterminate response from trying to kill service. " +
>             "Verifying whether it is running using nc...");
>         rc = execCommand(session, "nc -z localhost 8020");
> {code}
> Naively, we could point netcat at the correct hostname (since the NN ought 
> to be listening on the hostname it's configured as), or just use fuser.  
> fuser catches ports independently of what IPs they're bound to:
> {noformat}
> [root@xxx ~]# fuser 1234/tcp
> 1234/tcp:             6766  6768
> [root@xxx ~]# jobs
> [1]-  Running                 nc -l localhost 1234 &
> [2]+  Running                 nc -l rhel56-18.ent.cloudera.com 1234 &
> [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
> nc         6766      root    3u     IPv4            2563626                 TCP localhost:1234 (LISTEN)
> nc         6768      root    3u     IPv4            2563671                 TCP xxx:1234 (LISTEN)
> {noformat}
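
As a rough illustration of the two options in the description above, the check command could be built from the fencing target instead of being hardcoded to "localhost 8020". This is only a sketch and not the attached patch: the class FenceCheckCommands, its method names, and the hostname whitelist are hypothetical, and the resulting string is assumed to be handed to something like the execCommand(session, ...) call quoted above.

```java
import java.util.regex.Pattern;

public class FenceCheckCommands {
    // Assumption: restrict hosts to plain hostname/IPv4 characters so the
    // value can be placed in a shell command without quoting.
    private static final Pattern SAFE_HOST = Pattern.compile("[A-Za-z0-9.-]+");

    /** Option 1: probe the address the service is configured to listen on. */
    public static String netcatCheck(String host, int port) {
        if (host == null || !SAFE_HOST.matcher(host).matches()) {
            throw new IllegalArgumentException("Unsafe or empty host: " + host);
        }
        checkPort(port);
        return "nc -z " + host + " " + port;
    }

    /** Option 2: ask fuser about the port, whatever address it is bound to. */
    public static String fuserCheck(int port) {
        checkPort(port);
        return "fuser " + port + "/tcp";
    }

    private static void checkPort(int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
    }

    public static void main(String[] args) {
        // Hypothetical target matching the example in the description.
        System.out.println(netcatCheck("xxx", 8020));
        System.out.println(fuserCheck(8020));
    }
}
```

Either string would replace the literal "nc -z localhost 8020". The netcat form depends on the NN listening on the hostname it is configured with, while the fuser form does not depend on the bind address but may need sufficient privileges on the target host to see the owning process.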

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
