Hey Sid,
As background: for map-reduce we’re configuring
yarn.ipc.client.factory.class
hadoop.rpc.socket.factory.class.default
to a homegrown socket factory which translates between EC2 internal
and external addresses.
For Tez this does not seem to have any effect, even when I’m using
NetUtils.createSocketAddrForHost() as you suggested.
However, using NetUtils.createSocketAddrForHost() would make it possible to add
the address translation through NetUtils.addStaticResolution(), as I
figured out. So making this change would help.
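To illustrate the idea, here is a rough standalone sketch (plain JDK, not the actual Hadoop code): a static table is consulted before the socket address is built. The class name StaticHostResolver and the external name ec2-public.example.com are made up for illustration; the two method names only mirror the NetUtils methods mentioned above, and I'm assuming the real ones behave along these lines.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a static host-resolution table; this is NOT Hadoop's NetUtils. */
final class StaticHostResolver {
    // Maps an internal hostname to a name that is reachable from outside EC2.
    private static final Map<String, String> STATIC_RESOLUTIONS = new ConcurrentHashMap<>();

    private StaticHostResolver() {
    }

    /** Registers a translation from an internal hostname to an external one. */
    static void addStaticResolution(String host, String resolvedName) {
        STATIC_RESOLUTIONS.put(host, resolvedName);
    }

    /** Builds a socket address, applying a registered translation if one exists. */
    static InetSocketAddress createSocketAddrForHost(String host, int port) {
        String target = STATIC_RESOLUTIONS.getOrDefault(host, host);
        // createUnresolved skips the DNS lookup so the sketch also runs offline.
        return InetSocketAddress.createUnresolved(target, port);
    }

    public static void main(String[] args) {
        // Internal hostname and port are taken from the AM log; the external name is a placeholder.
        addStaticResolution("domU-12-31-39-0F-30-03", "ec2-public.example.com");
        InetSocketAddress addr = createSocketAddrForHost("domU-12-31-39-0F-30-03", 31000);
        System.out.println(addr.getHostString() + ":" + addr.getPort());
    }
}
```

With Hadoop on the classpath, the client would register the mapping via NetUtils.addStaticResolution() once at startup, and the address built in TezClientUtils would then pick it up.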
Should I create a ticket for that?
Johannes
On 13 Aug 2014, at 10:30, Siddharth Seth wrote:
> Johannes
> Getting the client to pick the correct IP to use is a little tricky. Hadoop
> itself has some utilities for this, which we could try using. Could you open
> a jira for this, please? We'll need some help trying it out.
>
> If you're building the Tez code base locally - could you try the following
> change
>
> TezClientUtils:820
> Replace
> final InetSocketAddress serviceAddr = new InetSocketAddress(amHost,
> amRpcPort);
> with
> final InetSocketAddress serviceAddr =
> NetUtils.createSocketAddrForHost(amHost, amRpcPort);
>
>
> On Wed, Aug 13, 2014 at 12:58 AM, Johannes Zillmann wrote:
> Hey Hitesh,
>
> so without changing the hostname of the ec2 instances to their public dns the
> log looks like:
> 2014-08-13 03:53:15,310 INFO [ServiceThread:DAGClientRPCServer]
> org.apache.tez.dag.api.client.DAGClientServer: Instantiated
> DAGClientRPCServer at domU-12-31-39-0F-30-03/10.193.51.241:31000
> 2014-08-13 03:53:15,332 INFO [IPC Server Responder]
> org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2014-08-13 03:53:15,336 INFO [IPC Server listener on 50192]
> org.apache.hadoop.ipc.Server: IPC Server listener on 50192: starting
>
> and the exception on the client is then:
> com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid
> host name: local host is: (unknown); destination host is:
> "domU-12-31-39-0F-30-03":31000; java.net.UnknownHostException; For more
> details see: http://wiki.apache.org/hadoop/UnknownHost
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
> at com.sun.proxy.$Proxy37.getAMStatus(Unknown Source)
> at
> org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:503)
> at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:576)
>
> HTH
> Johannes
>
> On 12 Aug 2014, at 22:05, Hitesh Shah wrote:
>
> > It seems like the AM is binding to the external/public hostname and not the
> > internal IP.
> >
> > Could you look for this log message in the AM logs: "Instantiated
> > DAGClientRPCServer at". This will provide some information as to what the
> > AM is binding to.
> >
> > thanks
> > — Hitesh
> >
> > On Aug 11, 2014, at 7:43 AM, Johannes Zillmann wrote:
> >
> >> Hey guys,
> >>
> >> I have a test infrastructure for Hadoop on EC2. The client usually sits
> >> outside of EC2.
> >> Using plain map-reduce on YARN, everything works fine.
> >> Using Tez, I run into the following exception:
> >>
> >> INFO [2014-07-29 00:09:06.653] [MrPlanRunnerV2] (TezClient.java:507) -
> >> Failed to retrieve AM Status via proxy
> >> com.google.protobuf.ServiceException:
> >> org.apache.hadoop.net.ConnectTimeoutException: Call From
> >> ip-10-73-6-154.ec2.internal/10.73.6.154 to
> >> ec2-54-81-245-144.compute-1.amazonaws.com:60914 failed on socket timeout
> >> exception: org.apache.hadoop.net.ConnectTimeoutException: connect timed
> >> out; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
> >>
> >> at
> >> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
> >> at com.sun.proxy.$Proxy116.getAMStatus(Unknown Source)
> >> at
> >> org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:500)
> >> at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:586)
> >>
> >>
> >> I could resolve the problem for Tez by changing the hostname of the instances
> >> to their public DNS names. However, that is causing problems with other
> >> components.
> >> Do you know of any place in Tez which is related to that? Any tweak which
> >> could make changing the hostname superfluous?
> >>
> >> best
> >> Johannes
> >
>
>