Hello Matthew, thanks for the answer. Oddly enough, the job isn't listed under the failed jobs, because it is still running. The execution never ends unless I kill it from the command line. And I did check the logs (I always do), but they don't say anything. By the way, I see no syslog, only stdout and stderr.
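In case it helps, this is roughly what I do to kill the hung job and to scan the raw per-attempt logs on each node. The `userlogs` path assumes the default Hadoop 0.20 layout, and the grep patterns are just the failure signatures I guess at, so treat this as a sketch:

```shell
# Kill the hung job from the command line (job id taken from the task names):
#   hadoop job -list
#   hadoop job -kill job_201409300940_0001

# Scan every attempt's raw logs under a node's userlogs directory for the
# usual failure signatures (assumed default layout: $HADOOP_HOME/logs/userlogs).
scan_task_logs() {
    # $1: a userlogs directory containing attempt_* subdirectories
    grep -riE 'exception|error|outofmemory|fatal' "$1" 2>/dev/null
}

# Example: scan_task_logs "$HADOOP_HOME/logs/userlogs"
```

Running this on both the master and the slave at least tells me whether anything was written to disk that the jobtracker pages are not showing.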
What I didn't mention in my previous post, and it might help, is that the algorithm executes perfectly on a single-node cluster setup. And from the fact that the SimpleShortestPathVertex runs on the multi-node setup just fine, we deduce that the multi-node setup is correct (right?).

2014-09-30 15:24 GMT+03:00 Matthew Cornell <m...@matthewcornell.org>:

> I'm new, but in my meager experience, when it stops at map 100% it means
> there was an error somewhere. In Giraph I've often found it difficult to
> pin down what that error actually was (e.g., out of memory), but the logs
> are the first place to look. Just to clarify re: not finding outputs: are
> you going to http://<your_host.com>:50030/jobtracker.jsp and clicking on
> the failed job id (e.g., job_201409251209_0029 ->
> http://<your_host.com>:50030/jobdetails.jsp?jobid=job_201409251209_0029&refresh=0
> )? From there, click the "map" link in the table to see its tasks. (Giraph
> runs entirely as a map task, IIUC.) You should see tasks for the master
> plus your workers. If you click on one of them (e.g.,
> task_201409251209_0029_m_000000 ->
> http://<your_host.com>:50030/taskdetails.jsp?tipid=task_201409251209_0029_m_000000
> ) you should see which machine it ran on, plus a link to the Task Logs. Click
> on "All" and you should see three sections for stdout, stderr, and syslog,
> the last of which usually contains hints about what went wrong. You
> should check all the worker logs.
>
> Hope that helps.
>
>
> On Tue, Sep 30, 2014 at 2:53 AM, Panagiotis Eustratiadis <
> ep.pan....@gmail.com> wrote:
>
>> Good morning,
>>
>> I have been having a problem for the past few days which, sadly, I can't
>> solve.
>>
>> First of all, I set up a Hadoop 0.20.203.0 cluster of two nodes, a master
>> and a slave. I followed this tutorial for the settings:
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>>
>> Then I set up Giraph and built it properly with Maven. When I run the
>> SimpleShortestPathVertex with number of workers = 2, it runs properly and
>> gives me results which I can view from either of the two nodes. Also, the
>> jobtracker at master:50030 and slave:50030 and everything else is working
>> as expected.
>>
>> However, when I try to run my own algorithm, it hangs at map 100%, reduce
>> 0% forever. I looked at SimpleShortestPathVertex for any configurations,
>> and it has none. And the weird part is: the jobs at the jobtracker have no
>> logs at stdout or stderr. The only thing readable is the map task info:
>>
>> task_201409300940_0001_m_000000 | 100.00% - MASTER_ZOOKEEPER_ONLY | 1
>> finished out of 2 on superstep -1
>> task_201409300940_0001_m_000001 | 100.00% | startSuperstep: WORKER_ONLY -
>> Attempt=0, Superstep=-1
>> task_201409300940_0001_m_000002 | 100.00% | startSuperstep: WORKER_ONLY -
>> Attempt=0, Superstep=-1
>>
>> Is there anything I'm overlooking? I have Googled the obvious Stack
>> Overflow solutions for two days now. Has anyone encountered anything
>> similar?
>>
>> Regards,
>> Panagiotis Eustratiadis.
>>
>
>
>
> --
> Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson
> Street, Amherst MA 01002 | matthewcornell.org