Re: hadoop job stuck.

2015-01-16 Thread craig mcmillan
dan, to function correctly the mesos web ui requires that the slave ip addresses the master uses are directly accessible from your browser too: it doesn't work if your slaves are not accessible to your browser on the same ip address that the master uses, such as happens when the master uses a

Re: hadoop job stuck.

2015-01-16 Thread Dick Davies
To view the slaves' logs, you need to be able to connect to that URL from your browser, not the master (the data is read directly from the slave by your browser, it doesn't go via the master).

On 15 January 2015 at 21:42, Dan Dong wrote:
> Hi, All,
> Now sandbox could be viewed on mesos UI, I s
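
A quick way to test the point above is to run a reachability check on the machine where the browser runs, not on the master. The sketch below is only an illustration: the function name `check_slave_reachable` is hypothetical, and the host and port in the usage note are taken from the 'centos-2.local:5051' error reported in this thread. It uses only the Python standard library.

```python
import socket


def check_slave_reachable(host, port=5051, timeout=3.0):
    """Check that `host` resolves and accepts TCP connections on `port`.

    Hypothetical helper, not part of Mesos. Returns (ok, detail).
    """
    # Step 1: the hostname must resolve on the machine running the browser.
    try:
        addr = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return False, f"cannot resolve {host}: {exc}"

    # Step 2: the resolved address must accept a TCP connection on the port.
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True, f"{host} ({addr}) accepts connections on port {port}"
    except OSError as exc:
        return False, f"{host} ({addr}) port {port} unreachable: {exc}"
```

Calling `check_slave_reachable("centos-2.local", 5051)` on the browser machine shows whether the name resolves there and whether the port is open. A resolution failure would point at DNS or `/etc/hosts` on that machine, while a connection failure would point at routing or a firewall.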

Re: hadoop job stuck.

2015-01-15 Thread Dan Dong
Hi, All,
Now the sandbox can be viewed on the mesos UI, and I see the following info (the same error appears on every slave sandbox):
"Failed to connect to slave '20150115-144719-3205108908-5050-4552-S0' on 'centos-2.local:5051'.
Potential reasons:
- The slave's hostname, 'centos-2.local', is not

Re: hadoop job stuck.

2015-01-14 Thread Brenden Matthews
Would need the task logs from the slave which the TaskTracker was launched on, to debug this further.

On Wed, Jan 14, 2015 at 1:28 PM, Dan Dong wrote:
> Checked /etc/hosts is correct, master and slave can ssh login each other
> by hostname without password, and hadoop runs well without mesos, bu

Re: hadoop job stuck.

2015-01-14 Thread Dan Dong
Checked that /etc/hosts is correct: master and slave can ssh into each other by hostname without a password, and hadoop runs well without mesos, but it gets stuck when running on mesos.

Cheers,
Dan

2015-01-14 15:02 GMT-06:00 Brenden Matthews:
> At a first glance, it looks like `/etc/hosts` might be set i

Re: hadoop job stuck.

2015-01-14 Thread Dan Dong
Hi, All,
I'm using hadoop-2.5.0-cdh5.2.0 on mesos 0.21.0, the following are from master log of mesos:
I0114 14:47:04.855033 17756 master.cpp:1768] Deactivating framework 20150114-144506-3205108908-5050-17750- (Hadoop: (RPC port: 9001, WebUI port: 50030)) at scheduler-d37097ae-8b55-4f73-9936-1

Re: hadoop job stuck.

2015-01-14 Thread Brenden Matthews
At first glance, it looks like `/etc/hosts` might be set incorrectly and it cannot resolve the hostname of the worker. See here for more: https://wiki.apache.org/hadoop/UnknownHost

On Wed, Jan 14, 2015 at 12:32 PM, Vinod Kone wrote:
> What do the master logs say?
>
> On Wed, Jan 14, 2015 at 1

Re: hadoop job stuck.

2015-01-14 Thread Vinod Kone
What do the master logs say?

On Wed, Jan 14, 2015 at 12:21 PM, Dan Dong wrote:
> Hi,
> When I run hadoop jobs on Mesos(0.21.0), the jobs are stuck for ever:
> 15/01/14 13:59:30 INFO mapred.FileInputFormat: Total input paths to
> process : 8
> 15/01/14 13:59:30 INFO mapred.JobClient: Running jo

Re: hadoop job stuck.

2015-01-14 Thread Tom Arnfeld
Hi Dan,
Can you look at the stdout/stderr logs in the task sandbox for me and share any errors here? Also – What version of Hadoop are you using, and what version of the Hadoop on Mesos framework?
Thanks.
--
Tom Arnfeld
Developer // DueDil

On Wednesday, Jan 14, 2015 at 8:2

hadoop job stuck.

2015-01-14 Thread Dan Dong
Hi,
When I run hadoop jobs on Mesos (0.21.0), the jobs are stuck forever:
15/01/14 13:59:30 INFO mapred.FileInputFormat: Total input paths to process : 8
15/01/14 13:59:30 INFO mapred.JobClient: Running job: job_201501141358_0001
15/01/14 13:59:31 INFO mapred.JobClient: map 0% reduce 0%
From j