[
https://issues.apache.org/jira/browse/SLIDER-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lev Bronshtein updated SLIDER-1259:
-----------------------------------
Description:
We run an environment where Hadoop worker nodes bind the Node Manager to an
interface whose hostname differs from the one returned by socket.getfqdn().
In our test environment, for example, the two names are f-bcpc-vm3 and plain
bcpc-vm3; the latter is the hostname bound to the management interface, not
the interface for hadoop/production traffic. As a result, we are unable to
introspect running jobs.
For example, running *slider registry --name slider_poc --listexp* results in
the following output in the ResourceManager logs:
{quote}2018-01-26 17:30:32,147 INFO
org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: ubuntu is accessing
unchecked
[http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports] which
is the app master GUI of application_1516910361403_0094 owned by ubuntu
2018-01-26 17:31:13,639 WARN org.mortbay.log:
/proxy/application_1516910361403_0094/ws/v1/slider/publisher/exports:
java.net.ConnectException: Connection timed out (Connection timed out)
{quote}
Note how the redirect is to
[http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports],
whereas it should have been to
[http://f-bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports].
Renaming the host to f-bcpc-vm3 results in appropriate behavior.
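The mismatch above can be spotted mechanically in the ResourceManager log. The following is only a rough sketch: it assumes the "is accessing unchecked" line format shown in the quote, and the helper names proxied_host and host_mismatch are made up for illustration, not part of Slider or Hadoop.

```python
import re
from urllib.parse import urlsplit

# Matches the proxied URL in a WebAppProxyServlet "is accessing unchecked"
# line. The optional "[" covers JIRA-style bracketed links.
_URL_RE = re.compile(r"accessing\s+unchecked\s+\[?(https?://[^\s\]]+)")


def proxied_host(log_line):
    """Return the host the proxy tried to reach, or None if no URL is found."""
    match = _URL_RE.search(log_line)
    return urlsplit(match.group(1)).hostname if match else None


def host_mismatch(log_line, expected_host):
    """True when the line names a host other than the expected one."""
    host = proxied_host(log_line)
    return host is not None and host != expected_host
```

Feeding it the INFO line from the quote with f-bcpc-vm3.bcpc.example.com as the expected host would flag the entry, since the proxy was sent to bcpc-vm3.bcpc.example.com.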
Perhaps *hostname.py* can be instructed to look at one of the following
before registering:
*yarn.nodemanager.address*
*yarn.nodemanager.bind-host*
*yarn.nodemanager.hostname*
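A minimal sketch of what such a lookup might look like. This is not the actual hostname.py code: the function name registration_hostname, the order in which the three properties are consulted, and the assumption that the YARN configuration is already available as a plain dict are all mine.

```python
import socket

# Candidate YARN properties, in the order this sketch consults them
# (the ordering is an assumption, not something Slider defines).
CANDIDATE_KEYS = (
    "yarn.nodemanager.hostname",
    "yarn.nodemanager.bind-host",
    "yarn.nodemanager.address",
)

# Values meaning "all interfaces"; these cannot be registered as a hostname.
WILDCARDS = {"", "0.0.0.0", "::"}


def host_part(value):
    """Strip an optional :port suffix from a host or host:port value."""
    value = value.strip()
    if value.startswith("["):        # bracketed IPv6 literal, e.g. [::1]:8041
        return value[1:].split("]", 1)[0]
    if value.count(":") == 1:        # host:port
        return value.split(":", 1)[0]
    return value                     # bare host or bare IPv6 address


def registration_hostname(yarn_conf, fallback=socket.getfqdn):
    """Pick the first usable configured host, else fall back to getfqdn()."""
    for key in CANDIDATE_KEYS:
        raw = yarn_conf.get(key)
        if raw is None:
            continue
        host = host_part(raw)
        # Skip wildcard binds and unresolved ${...} placeholders.
        if host in WILDCARDS or "${" in host:
            continue
        return host
    return fallback()
```

With *yarn.nodemanager.hostname* set to f-bcpc-vm3.bcpc.example.com, this would return that name instead of whatever socket.getfqdn() reports; with none of the properties set to a usable value, behavior is unchanged.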
> Slider does not work well in multi homed environments
> -----------------------------------------------------
>
> Key: SLIDER-1259
> URL: https://issues.apache.org/jira/browse/SLIDER-1259
> Project: Slider
> Issue Type: Bug
> Components: agent
> Affects Versions: Slider 0.92
> Reporter: Lev Bronshtein
> Priority: Minor