Mh, that's weird. Maybe both resource managers are marked as "standby"? Not
sure what can cause this issue.

Which YARN version are you using? Maybe you need to build Flink against
that specific Hadoop version yourself.

On Mon, Feb 8, 2016 at 5:50 PM, Pieter Hameete <phame...@gmail.com> wrote:

> After downloading and building the 1.0-SNAPSHOT from the master branch I
> do run into another problem when starting a YARN cluster. The startup now
> infinitely loops at the following step:
>
> 17:39:12,369 INFO  org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider  - Failing over to rm2
> 17:39:34,855 INFO  org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider  - Failing over to rm1
>
> Any clue what could have gone wrong? I used all default settings when
> building with Maven.
>
> - Pieter
>
>
>
> 2016-02-08 17:07 GMT+01:00 Pieter Hameete <phame...@gmail.com>:
>
>> Matter of RTFM eh ;-) thx and sorry for the bother.
>>
>> 2016-02-08 17:06 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>>
>>> You said earlier that you are using Flink 0.10. The feature is only
>>> available in 1.0-SNAPSHOT.
>>>
>>> On Mon, Feb 8, 2016 at 4:53 PM, Pieter Hameete <phame...@gmail.com>
>>> wrote:
>>>
>>>> I've tried setting the yarn.application-master.port property in
>>>> flink-conf.yaml to a range suggested in
>>>> https://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html#running-flink-on-yarn-behind-firewalls
>>>>
>>>> The JobManager does not seem to be picking the property up. Am I
>>>> setting this in the wrong place? Or is there another way to enforce this
>>>> property?
>>>>
>>>> Cheers,
>>>>
>>>> Pieter
>>>>
>>>> 2016-02-07 20:04 GMT+01:00 Pieter Hameete <phame...@gmail.com>:
>>>>
>>>>> I found the relevant information on the website. I'll consult with the
>>>>> cluster admin tomorrow, thanks for the help :-)
>>>>>
>>>>> - Pieter
>>>>>
>>>>> 2016-02-07 19:31 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> we had other users with a similar issue as well. There is a
>>>>>> configuration value which allows you to specify a single port or a
>>>>>> range of ports for the JobManager to allocate when running on YARN.
>>>>>> Note that when using this with a single port, the JMs may collide.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 7, 2016 at 7:25 PM, Pieter Hameete <phame...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Stephan,
>>>>>>>
>>>>>>> surely it seems this way! I must not be the first with this issue
>>>>>>> though? I'll have to contact the cluster admins to find a solution
>>>>>>> together. What would be a way to make the JobManagers accessible from
>>>>>>> outside the network, given that the IP and port number change every time?
>>>>>>>
>>>>>>> Alternatively, I can ask for ssh access to a node within the
>>>>>>> network. That will surely work but it's not my preferred solution.
>>>>>>>
>>>>>>> - Pieter
>>>>>>>
>>>>>>> 2016-02-06 16:22 GMT+01:00 Stephan Ewen <se...@apache.org>:
>>>>>>>
>>>>>>>> Yeah, sounds a lot like the client cannot connect to the JobManager
>>>>>>>> port.
>>>>>>>>
>>>>>>>> The ports to communicate with HDFS and the YARN resource manager
>>>>>>>> may be whitelisted or forwarded, so you can submit the YARN session, but
>>>>>>>> then not connect to the JobManager afterwards.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Feb 6, 2016 at 2:11 PM, Pieter Hameete <phame...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Max!
>>>>>>>>>
>>>>>>>>> I'm using Flink 0.10.1 and indeed the cluster seems to be created
>>>>>>>>> fine, everything in the JobManager web UI looks good.
>>>>>>>>>
>>>>>>>>> It seems like the JobManager initiates the connection with my VM
>>>>>>>>> and cannot reach it. It could be that this is similar to the problem
>>>>>>>>> here:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/spark-with-docker-errors-with-akka-NAT-td7702.html
>>>>>>>>>
>>>>>>>>> I probably have to make some changes to the networking
>>>>>>>>> configuration of my VM so it can be reached by the JobManager despite
>>>>>>>>> using a different port each time.
>>>>>>>>>
>>>>>>>>> - Pieter
>>>>>>>>>
>>>>>>>>> 2016-02-06 14:05 GMT+01:00 Maximilian Michels <m...@apache.org>:
>>>>>>>>>
>>>>>>>>>> Hi Pieter,
>>>>>>>>>>
>>>>>>>>>> Which version of Flink are you using? It appears you've created a
>>>>>>>>>> Flink YARN cluster but you can't reach the JobManager afterwards.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Max
>>>>>>>>>>
>>>>>>>>>> On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete <phame...@gmail.com> wrote:
>>>>>>>>>> > Hi Robert,
>>>>>>>>>> >
>>>>>>>>>> > unfortunately there are no signs of what is going wrong in the logs.
>>>>>>>>>> > The last log messages are about successful registration of the
>>>>>>>>>> > TaskManagers.
>>>>>>>>>> >
>>>>>>>>>> > I'm also fairly sure it must be something in my VM that is causing
>>>>>>>>>> > this, because when I start the yarn-session from a login node that is
>>>>>>>>>> > on the same network as the Hadoop cluster there are no problems
>>>>>>>>>> > registering with the JobManager. I did also notice the following
>>>>>>>>>> > message in the local console:
>>>>>>>>>> >
>>>>>>>>>> > 12:30:27,173 WARN  Remoting - Tried to associate with unreachable remote address [akka.tcp://flink@145.100.41.13:41539]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: connection timed out: /145.100.41.13:41539
>>>>>>>>>> >
>>>>>>>>>> > I can ping the JobManager fine from within the VM. Could there be some
>>>>>>>>>> > invalid or missing configuration on my side?
>>>>>>>>>> >
>>>>>>>>>> > Cheers,
>>>>>>>>>> >
>>>>>>>>>> > Pieter
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > 2016-02-06 12:54 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>>>>>>>>>> >>
>>>>>>>>>> >> Hi,
>>>>>>>>>> >>
>>>>>>>>>> >> did you check the logs of the JobManager itself? Maybe it'll tell us
>>>>>>>>>> >> already what's going on.
>>>>>>>>>> >>
>>>>>>>>>> >> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete <phame...@gmail.com>
>>>>>>>>>> >> wrote:
>>>>>>>>>> >>>
>>>>>>>>>> >>> Hi Guys!
>>>>>>>>>> >>>
>>>>>>>>>> >>> I'm attempting to run Flink on YARN, but I run into an issue. I'm
>>>>>>>>>> >>> starting the Flink YARN session from an Ubuntu 14.04 VM. All goes well
>>>>>>>>>> >>> until after the JobManager web UI is started:
>>>>>>>>>> >>>
>>>>>>>>>> >>> JobManager web interface address http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
>>>>>>>>>> >>> Waiting until all TaskManagers have connected
>>>>>>>>>> >>> 11:09:51,557 INFO  org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>>>>>> >>> 11:09:51,578 INFO  org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>>>>>>> >>> 11:09:51,583 INFO  org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
>>>>>>>>>> >>> 11:09:51,595 INFO  org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@145.100.41.148:35666/user/jobmanager.
>>>>>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>>>>>> >>>
>>>>>>>>>> >>> It then hangs on these last steps (trying to register, no status
>>>>>>>>>> >>> updates..)
>>>>>>>>>> >>>
>>>>>>>>>> >>> I'm sure there must be a problem on my side that is causing me not to
>>>>>>>>>> >>> be able to register at the JobManager. What could cause such connection
>>>>>>>>>> >>> problems?
>>>>>>>>>> >>>
>>>>>>>>>> >>> Any tips are very welcome :-)
>>>>>>>>>> >>>
>>>>>>>>>> >>> Cheers and have a good weekend!
>>>>>>>>>> >>>
>>>>>>>>>> >>> - Pieter
>>>>>>>>>> >>>
>>>>>>>>>> >>>
>>>>>>>>>> >>
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
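
The symptom reported in the thread (ping to the JobManager host works, but the
Akka connection to its port times out) is what one would expect when a firewall
blocks the TCP port. Below is a minimal Python sketch for probing this from the
client machine, assuming a fixed port or range has been set through
yarn.application-master.port. The helper names and the example range
50100-50200 are illustrative assumptions and not part of Flink.

```python
import socket
from typing import Iterable, List


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def parse_port_range(spec: str) -> List[int]:
    """Parse a single port ("50100") or an inclusive range ("50100-50200").

    Hypothetical helper: it only handles these two simple forms.
    """
    if "-" in spec:
        low, high = (int(part) for part in spec.split("-", 1))
        return list(range(low, high + 1))
    return [int(spec)]


def reachable_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> List[int]:
    """Return the subset of ports on host that accept a TCP connection."""
    return [p for p in ports if can_connect(host, p, timeout)]


# Example call (host and range are placeholders, not values from a real setup):
# reachable_ports("jobmanager.example.org", parse_port_range("50100-50200"))
```

If the probe fails for every port in the configured range while ping succeeds,
the range is probably not open between the client and the cluster, which is a
question for the cluster administrators.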
