If you use OpenVPN to access AWS, you can reach the EC2 machine by its
private IP from your laptop.
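
As a rough illustration of why this helps, here is a small Python sketch. It is only a simplified model, not Flink or Akka code: the helper name akka_would_accept is made up, and the two addresses are the ones from the logs quoted further down. The idea is that the JobManager's actor system drops messages whose target address differs from the address it bound to, which on EC2 is the private IP. Over a VPN the laptop can dial that private IP directly, so the two match.

```python
import ipaddress

# Addresses taken from the logs quoted later in this thread.
JOBMANAGER_BOUND = "172.31.23.18"  # address the actor system bound to
CLIENT_DIALED = "54.35.41.12"      # public address the laptop client used


def akka_would_accept(dialed: str, bound: str) -> bool:
    """Simplified model (an assumption, not Akka's real logic):
    a message is accepted only if the dialed address equals the bound one."""
    return ipaddress.ip_address(dialed) == ipaddress.ip_address(bound)


# The bound address is in a private range, so it is only reachable
# from inside the network (or through a VPN).
print(ipaddress.ip_address(JOBMANAGER_BOUND).is_private)

# Without VPN: the client dials the public IP, which does not match.
print(akka_would_accept(CLIENT_DIALED, JOBMANAGER_BOUND))

# With VPN: the client dials the private IP, which matches.
print(akka_would_accept(JOBMANAGER_BOUND, JOBMANAGER_BOUND))
```
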


Thanks
Ashutosh

On Tue, Jun 7, 2016 at 11:00 PM, Shannon Carey <sca...@expedia.com> wrote:

> We're also starting to look at automating job deployment/start to Flink
> running on EMR. There are a few options:
>
>    - Use RemoteExecutionEnvironment (per the examples). Problems: not
>    sure of the best way to upload the JAR, and not sure how to run it
>    detached so that the Java program that starts the job is asynchronous
>    with the long-running cluster job.
>    - Use the CLI. Problems: it needs to run locally on the YARN node,
>    otherwise you encounter the problems discussed below? It requires a Flink
>    distro. Logs of the launch will remain local to the machine that executes
>    it (e.g. if it's on a Jenkins slave).
>    - Use the HTTP API
>
> Is using the HTTP API a reasonable approach? Is that API considered stable
> enough that we could rely on it continuing to be present?
>
> Thanks,
> Shannon
>
>
> From: "Bajaj, Abhinav" <abhinav.ba...@here.com>
> Date: Monday, June 6, 2016 at 12:10 PM
> To: Josh <jof...@gmail.com>
> Cc: "user@flink.apache.org" <user@flink.apache.org>
>
> Subject: Re: Submit Flink Jobs to YARN running on AWS
>
> Hi Josh,
>
> I have not yet :-( . I am working on getting a REST service set up on AWS
> that can do it, rather than using the Flink client remotely.
> This way the Akka communication stays within AWS.
>
> However, I still need the solution for running some of the
> integration/system tests.
>
> ~ Abhi
>
> From: Josh <jof...@gmail.com>
> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
> Date: Monday, June 6, 2016 at 11:55 AM
> To: "user@flink.apache.org" <user@flink.apache.org>
> Subject: Re: Submit Flink Jobs to YARN running on AWS
>
> Hi Abhi,
>
> I'm also looking to deploy Flink jobs remotely to YARN, and eventually
> automate it - just wondering if you found a way to do it?
>
> Thanks,
> Josh
>
> On Wed, May 25, 2016 at 12:36 AM, Bajaj, Abhinav <abhinav.ba...@here.com>
> wrote:
>
>> Hi,
>>
>> Has anyone tried to submit a Flink job remotely to YARN running in AWS?
>> The case I am stuck with is where the Flink client is on my laptop and
>> YARN is running on AWS.
>>
>> @Robert, Did you get a chance to try this out?
>>
>> Regards,
>> Abhi
>>
>> From: "Bajaj, Abhinav" <abhinav.ba...@here.com>
>> Date: Friday, April 29, 2016 at 3:50 PM
>>
>> To: "user@flink.apache.org" <user@flink.apache.org>
>> Subject: Re: Submit Flink Jobs to YARN running on AWS
>>
>> Hi Robert,
>>
>> Thanks for your reply.
>>
>> I am using the Public DNS for the EC2 machines in the yarn and hdfs
>> configuration files. It looks like
>> "ec2-203-0-113-25.compute-1.amazonaws.com".
>> You should be able to connect then.
>>
>> I have Hadoop installed locally and the YARN_CONF_DIR is pointing to it.
>> The yarn-site.xml and core-site.xml files use the address (Public DNS) of
>> the resource manager running in AWS.
>>
>> So, whenever I submit the job using the client on my laptop, it connects
>> to the RM.
>> The RM starts the YARN application and starts the JobManager.
>> The JobManager starts the actor system using the internal IP of the
>> NodeManager. In my understanding, this is where the problem lies.
>>
>> The local client tries to connect to the JobManager actor system, but the
>> actor system drops the messages because the IP address it started with
>> (the EC2 internal IP) does not match the external IP address (the public
>> IP) that the Flink client used to send the message.
>> Please see my first mail below for detailed logs.
>>
>> Please keep me posted with your progress.
>>
>> I plan to move the cluster to a VPC for other reasons.
>> I have limited knowledge of VPC, but I guess the difference between the
>> internal and external IP addresses will not be resolved by that.
>> Please correct me if your views are different.
>>
>> It would be great if you are able to reproduce the issue.
>>
>> Thanks again.
>> Abhi
>>
>>
>>
>> *Abhinav Bajaj*
>>
>> Senior Engineer
>>
>> HERE Predictive Analytics
>>
>> Office:  +12062092767
>>
>> Mobile: +17083299516
>>
>> *HERE Seattle*
>>
>> 701 Pike Street, #2000, Seattle, WA 98101, USA
>>
>> *47° 36' 41" N. 122° 19' 57" W*
>>
>> *HERE Maps*
>>
>>
>>
>>
>> From: Robert Metzger <rmetz...@apache.org>
>> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
>> Date: Tuesday, April 26, 2016 at 3:16 AM
>> To: "user@flink.apache.org" <user@flink.apache.org>
>> Subject: Re: Submit Flink Jobs to YARN running on AWS
>>
>> I've started my own EMR cluster and tried to launch a Flink job from my
>> local machine on it.
>> I have to admit that configuring the EMR-launched Hadoop for external
>> access is quite a hassle.
>>
>> I'm not even able to submit Flink to the YARN cluster because the client
>> cannot connect to the ResourceManager. I've changed the resource manager
>> hostname to the public one in the yarn-site.xml on the cluster and
>> restarted it, but the client still cannot connect.
>> It seems that the RM address is being overwritten by the Hadoop code?
>> [image: Inline image 1]
>>
>> How did you manage to get this working?
>>
>> In the VM settings, I disabled the "Source/Dest checks", but I don't
>> think this is related.
>>
>> Have you considered using Amazon's VPN service? I guess you would then
>> have "local" access to the cluster.
>>
>> On YARN, Flink is not using the flink-conf.yaml setting for the
>> JobManager's hostname. It's using YARN's "yarn.nodemanager.hostname" from
>> the yarn-site.xml.
>> I haven't tried it, but it could work if you set the public hostname of
>> each NodeManager in the yarn-site.xml.
>>
>> Also, maybe the product forum / customer support of Amazon can help you
>> here. Other systems like Spark or Storm have very similar architectures and
>> will face the same issues. I guess they have some recipes for such
>> situations.
>>
>> Regards,
>> Robert
>>
>>
>>
>>
>> On Tue, Apr 26, 2016 at 10:47 AM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi Abhi,
>>>
>>> I'll try to reproduce the issue and come up with a solution.
>>>
>>> On Tue, Apr 26, 2016 at 1:13 AM, Bajaj, Abhinav <abhinav.ba...@here.com>
>>> wrote:
>>>
>>>> Hi Fabian,
>>>>
>>>> Thanks for your reply and the pointers to documentation.
>>>>
>>>> In these steps, I think the Flink client is installed on the master
>>>> node, referring to steps mentioned in Flink docs here
>>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/aws.html>
>>>> .
>>>> However, the scenario I have is to run the client on my local machine
>>>> and submit jobs remotely to the YARN Cluster (running on EMR or
>>>> independently).
>>>>
>>>> Let me describe in more detail here.
>>>> I am trying to submit a single Flink Job to YARN using the client,
>>>> running on my dev machine -
>>>>
>>>> ./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096
>>>>  ./examples/batch/WordCount.jar
>>>>
>>>> In my understanding, YARN (running in AWS) allocates a container for
>>>> the JobManager.
>>>> The JobManager discovers the IP and starts the actor system. At this step
>>>> the IP it uses is the internal IP address of the EC2 instance.
>>>>
>>>> The client, running on my dev machine, is not able to connect to the
>>>> JobManager for reasons explained in my mail below.
>>>>
>>>> Is there a way to make the JobManager use the hostname and not
>>>> the IP address?
>>>>
>>>> Or any other suggestions?
>>>>
>>>> Thanks,
>>>> Abhi
>>>>
>>>>
>>>> From: Fabian Hueske <fhue...@gmail.com>
>>>> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
>>>> Date: Wednesday, March 9, 2016 at 12:51 AM
>>>> To: "user@flink.apache.org" <user@flink.apache.org>
>>>> Subject: Re: Submit Flink Jobs to YARN running on AWS
>>>>
>>>> Hi Abhi,
>>>>
>>>> I have used Flink on EMR via YARN a couple of times without problems.
>>>> I started a Flink YARN session like this:
>>>>
>>>> ./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
>>>>
>>>> This will start five YARN containers (1 JobManager with 1024MB, 4
>>>> Taskmanagers with 4096MB). See more config options in the documentation 
>>>> [1].
>>>> In one of the last lines of the std-out output you should find a line
>>>> that tells you the IP and port of the JobManager.
>>>>
>>>> With the IP and port, you can submit a job as follows:
>>>>
>>>> ./bin/flink run -m jmIP:jmPort -p 4 jobJarFile.jar <arguments>
>>>>
>>>> This will send the job to the JobManager specified by IP and port and
>>>> execute the program with a parallelism of 4. See more config options in the
>>>> documentation [2].
>>>>
>>>> If this does not help, could you share the exact command that you use
>>>> to start the YARN session and submit the job?
>>>>
>>>> Best, Fabian
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/yarn_setup.html
>>>> [2]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/cli.html
>>>>
>>>> 2016-03-08 0:25 GMT+01:00 Bajaj, Abhinav <abhinav.ba...@here.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am a newbie to Flink and trying to use it in AWS.
>>>>> I have created a YARN cluster on AWS EC2 machines.
>>>>> I am trying to submit a Flink job to the remote YARN cluster using the
>>>>> Flink client running on my local machine.
>>>>>
>>>>> The JobManager starts successfully in the YARN container, but the client
>>>>> is not able to connect to the JobManager.
>>>>>
>>>>> Flink Client Logs -
>>>>>
>>>>> 13:57:34,877 INFO  org.apache.flink.yarn.FlinkYarnClient - Deploying cluster, current state ACCEPTED
>>>>> 13:57:35,951 INFO  org.apache.flink.yarn.FlinkYarnClient - Deploying cluster, current state ACCEPTED
>>>>> 13:57:37,027 INFO  org.apache.flink.yarn.FlinkYarnClient - YARN application has been deployed successfully.
>>>>> 13:57:37,100 INFO  org.apache.flink.yarn.FlinkYarnCluster - Start actor system.
>>>>> 13:57:37,532 INFO  org.apache.flink.yarn.FlinkYarnCluster - Start application client.
>>>>> YARN cluster started
>>>>> JobManager web interface address
>>>>> http://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8088/proxy/application_1456184947990_0003/
>>>>> Waiting until all TaskManagers have connected
>>>>> 13:57:37,540 INFO  org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@54.35.41.12:41292/user/jobmanager with session ID null.
>>>>> No status updates from the YARN cluster received so far. Waiting ...
>>>>> 13:57:37,543 INFO  org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@54.35.41.12:41292/user/jobmanager with session ID null.
>>>>> 13:57:37,543 INFO  org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
>>>>> 13:57:37,545 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@54.35.41.12:41292/user/jobmanager.
>>>>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>
>>>>> The logs of the JobManager contain the following -
>>>>>
>>>>> 21:57:39,142 ERROR akka.remote.EndpointWriter - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@54.35.41.12:41292/]] arriving at [akka.tcp://flink@54.35.41.12:41292] inbound addresses are [akka.tcp://flink@172.31.23.18:41292]
>>>>> 21:57:40,782 INFO  org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at ec2-54-35-41-12 (akka.tcp://flink@172.31.23.18:60565/user/taskmanager) as 72101dd2ee94caa7a5ec5a75488359aa. Current number of registered hosts is 1. Current number of alive task slots is 1.
>>>>> 21:57:41,162 ERROR akka.remote.EndpointWriter - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@54.35.41.12:41292/]] arriving at [akka.tcp://flink@54.35.41.12:41292] inbound addresses are [akka.tcp://flink@172.31.23.18:41292]
>>>>>
>>>>> The problem seems to be a mismatch between the address the JobManager's
>>>>> Akka actor system is running on and the one used by the client.
>>>>> 172.31.23.18 – is the internal private IP of the EC2 machine where the
>>>>> JobManager container is running.
>>>>> 54.35.41.12 – is the external IP of the EC2 machine, used by the Flink
>>>>> client to submit the job.
>>>>> Because of this mismatch, the messages are ignored by the Akka actor
>>>>> system.
>>>>>
>>>>> Can someone please help me with this issue?
>>>>> I can share the detailed logs, if required.
>>>>>
>>>>> Thanks,
>>>>> Abhi
>>>>>
>>>>>
>>>>
>>>
>>
>
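
The address mismatch reported in the JobManager log quoted above can also be spotted mechanically. The following is a minimal Python sketch, assuming the log entry is available on a single line. The regular expression and the helper name find_mismatch are illustrative only and not part of Flink or Akka; the sample line is the EndpointWriter entry from the first mail in the thread.

```python
import re

# EndpointWriter entry from the JobManager log, joined onto one line.
LOG_LINE = (
    "21:57:39,142 ERROR akka.remote.EndpointWriter - dropping message "
    "[class akka.actor.ActorSelectionMessage] for non-local recipient "
    "[Actor[akka.tcp://flink@54.35.41.12:41292/]] arriving at "
    "[akka.tcp://flink@54.35.41.12:41292] inbound addresses are "
    "[akka.tcp://flink@172.31.23.18:41292]"
)

# Captures the address the message arrived at and the address the
# actor system is actually listening on.
PATTERN = re.compile(
    r"arriving at \[akka\.tcp://flink@(?P<dialed>[\d.]+):\d+\] "
    r"inbound addresses are \[akka\.tcp://flink@(?P<bound>[\d.]+):\d+\]"
)


def find_mismatch(line):
    """Return (dialed, bound) if the line reports two different addresses,
    otherwise None."""
    match = PATTERN.search(line)
    if match is None or match.group("dialed") == match.group("bound"):
        return None
    return match.group("dialed"), match.group("bound")


print(find_mismatch(LOG_LINE))
```

For the entry above, the first element of the result is the public IP the client dialed and the second is the private IP the actor system bound to.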
