Hi, it might be a memory issue as you suggest. Could you also check the logs from HDFS?
Also, can you run the program with a smaller input?

Anastasis

On 6 Sep 2013, at 8:03 PM, Mahesh Babu <[email protected]> wrote:

> Hi Anastasis,
>
> Yes I am able to run in standalone (local) mode. I had to increase the RAM
> size from 1GB to 2GB for the VM that runs the standalone program. The topo
> size is approx. 1500k vertices.
>
> When in pseudo mode, the job did not run as it is. I had to increase max bsp
> tasks from 4 to 5 (previous error). After increasing it to 5, it proceeds
> further and then fails like above log.
>
> Thanks,
> Mahesh Babu
>
>
> On Fri, Sep 6, 2013 at 3:25 PM, Anastasis Andronidis <[email protected]> wrote:
>
>> Hello,
>>
>> can you run your code on standalone mode so you can be sure that the
>> problem is not on your code?
>>
>> Kindly,
>> Anastasis
>>
>> On 6 Sep 2013, at 12:35 PM, Mahesh Babu <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> When I run a hama job in pseudo distributed mode (single node) I get
>>> following error: (in stdout)
>>>>>>>>>>>>>>>
>>> attempt_201309061315_0005_000000_0: 13/09/06 14:01:39 DEBUG fs.FSInputChecker: DFSClient readChunk got seqno 593 offsetInBlock 38862848 lastPacketInBlock false packetLen 66052
>>> attempt_201309061315_0005_000000_0: 13/09/06 14:01:39 DEBUG fs.FSInputChecker: DFSClient readChunk got seqno 594 offsetInBlock 38928384 lastPacketInBlock false packetLen 66052
>>> attempt_201309061315_0005_000000_0: 13/09/06 14:01:40 DEBUG fs.FSInputChecker: DFSClient readChunk got seqno 595 offsetInBlock 38993920 lastPacketInBlock false packetLen 66052
>>> attempt_201309061315_0005_000000_0: 13/09/06 14:01:40 DEBUG fs.FSInputChecker: DFSClient readC
>>> *13/09/06 14:03:29 INFO bsp.BSPJobClient: Job failed.*
>>> <<<<<<<<<<<<
>>>
>>>
>>> *hama-ubuntu-bspmaster-ubuntu.log*
>>>>>>>>>>>>>>>
>>> 2013-09-06 14:03:21,422 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
>>> 2013-09-06 14:03:23,423 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
>>> 2013-09-06 14:03:25,424 DEBUG org.apache.hama.bsp.Counters: Adding SUPERSTEP_SUM
>>> *2013-09-06 14:03:25,425 INFO org.apache.hama.bsp.JobInProgress: Taskid 'attempt_201309061315_0005_000000_0' has failed.
>>> 2013-09-06 14:03:25,425 INFO org.apache.hama.bsp.TaskInProgress: Task 'task_201309061315_0005_000000' has failed.
>>> *2013-09-06 14:03:25,425 DEBUG org.apache.hama.bsp.JobInProgress: Removing /tmp/hadoop-ubuntu/bsp/local/bspMaster/job_201309061315_0005.xml and /tmp/hadoop-ubuntu/bsp/local/bspMaster/job_201309061315_0005.jar getJobFile = hdfs://localhost:9000/tmp/hadoop-ubuntu*/bsp/system/submit_714o6m/job.xml
>>> 2013-09-06 14:03:25,434 INFO org.apache.hama.bsp.JobInProgress: Job failed.
>>> 2013-09-06 14:03:25,434 DEBUG org.apache.hama.bsp.JobInProgress: Removing null and null getJobFile = hdfs://localhost:9000/tmp/hadoop-ubuntu/bsp/system/submit_714o6m/job.xml
>>> *<<<<<<<<<<<<<
>>>
>>> *hama-ubuntu-groom-ubuntu.log*
>>>>>>>>>>>>>>>
>>> 2013-09-06 14:03:14,660 DEBUG org.apache.hama.bsp.GroomServer: checking task: attempt_201309061315_0005_000000_0 starttime =1378456254247 lastping = 1378456334727 run state = RUNNING monitorPeriod = 10000 check = false
>>> 2013-09-06 14:03:24,660 DEBUG org.apache.hama.bsp.GroomServer: checking task: attempt_201309061315_0005_000000_0 starttime =1378456254247 lastping = 1378456334727 run state = RUNNING monitorPeriod = 10000 check = true
>>> 2013-09-06 14:03:24,660 INFO org.apache.hama.bsp.GroomServer: adding purge task: attempt_201309061315_0005_000000_0
>>> 2013-09-06 14:03:24,660 DEBUG org.apache.hama.bsp.GroomServer: Got 1 oblivious tasks
>>> 2013-09-06 14:03:24,661 DEBUG org.apache.hama.bsp.GroomServer: Purging task org.apache.hama.bsp.GroomServer$TaskInProgress@2e0cd499
>>> *2013-09-06 14:03:24,661 INFO org.apache.hama.bsp.GroomServer: About to purge task: attempt_201309061315_0005_000000_0
>>> 2013-09-06 14:03:24,661 DEBUG org.apache.hama.bsp.GroomServer: Killing process for attempt_201309061315_0005_000000_0
>>> 2013-09-06 14:03:25,436 DEBUG org.apache.hama.bsp.GroomServer: Got Response from BSPMaster with 1 actions
>>> 2013-09-06 14:03:25,437 INFO org.apache.hama.bsp.GroomServer: Kill 1 tasks.
>>> *<<<<<<<<<<<<<
>>>
>>> *attempt_201309061315_0005_000000_0.log*
>>>>>>>>>>>>>>>
>>> 13/09/06 14:02:06 DEBUG ipc.RPC: Call: ping 2
>>> 13/09/06 14:02:07 DEBUG fs.FSInputChecker: DFSClient readChunk got seqno 633 offsetInBlock 41484288 lastPacketInBlock false packetLen 66052
>>> 13/09/06 14:02:14 DEBUG bsp.BSPTask: Pinging at time 1378456334726
>>> 13/09/06 14:02:14 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:49551 from ubuntu sending #24
>>> 13/09/06 14:02:14 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:49551 from ubuntu got value #24
>>> 13/09/06 14:02:14 DEBUG ipc.RPC: Call: ping 2
>>> 13/09/06 14:02:37 DEBUG bsp.BSPTask: Pinging at time 1378456357688
>>> 13/09/06 14:02:37 DEBUG ipc.Client: The ping interval is60000ms.
>>> 13/09/06 14:02:38 DEBUG ipc.Client: Use SIMPLE authentication for protocol BSPPeerProtocol
>>> 13/09/06 14:02:39 DEBUG ipc.Client: Connecting to localhost/127.0.0.1:49551
>>> 13/09/06 14:02:56 DEBUG ipc.Client: The ping interval is60000ms.
>>> 13/09/06 14:02:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
>>> 13/09/06 14:02:57 DEBUG ipc.Client: Connecting to localhost/127.0.0.1:9000
>>> 13/09/06 14:02:58 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:49551 from ubuntu: closed
>>> 13/09/06 14:02:59 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:49551 from ubuntu: stopped, remaining connections 1
>>>>>>>>>>>>>>>
>>>
>>> Any idea why job is failing. No exceptions or failures in any logs even
>>> when I put the logs in DEBUG mode.
>>>
>>> Thanks,
>>> Mahesh Babu
>>
>>
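
One thing the quoted groom log does seem to show: lastping stays at 1378456334727 across both checks, while the task log has a later "Pinging at time 1378456357688" that the groom apparently never recorded before it purged the task. Below is a minimal Python sketch of that arithmetic, using only values copied from the quoted logs. The helper names, the regular expression, and the 1000 ms tolerance are my own illustration and are not part of Hama.

```python
import re

# Values copied from the quoted logs (quote markers and line wraps removed).
GROOM_LINE = (
    "2013-09-06 14:03:24,660 DEBUG org.apache.hama.bsp.GroomServer: checking "
    "task: attempt_201309061315_0005_000000_0 starttime =1378456254247 "
    "lastping = 1378456334727 run state = RUNNING monitorPeriod = 10000 "
    "check = true"
)
# "Pinging at time ..." values from the task attempt log, in milliseconds.
TASK_PINGS_MS = [1378456334726, 1378456357688]


def parse_groom_check(line):
    """Hypothetical helper: pull the millisecond fields out of a 'checking task' line."""
    fields = {}
    for name in ("starttime", "lastping", "monitorPeriod"):
        match = re.search(name + r"\s*=\s*(\d+)", line)
        if match is None:
            raise ValueError("field not found: " + name)
        fields[name] = int(match.group(1))
    return fields


def summarize(line, task_pings_ms, tolerance_ms=1000):
    """Compare what the groom recorded with what the task says it sent."""
    f = parse_groom_check(line)
    # Pings the task logged well after the groom's last recorded ping.
    # tolerance_ms is an assumed allowance for clock/logging jitter.
    unseen = [p for p in task_pings_ms if p - f["lastping"] > tolerance_ms]
    return {
        "ran_before_last_ping_ms": f["lastping"] - f["starttime"],
        "monitor_period_ms": f["monitorPeriod"],
        "unseen_pings_ms": unseen,
        "unseen_lag_ms": [p - f["lastping"] for p in unseen],
    }


if __name__ == "__main__":
    for key, value in summarize(GROOM_LINE, TASK_PINGS_MS).items():
        print(key, value)
```

A task that stalls between pings would be consistent with the memory pressure suspected above, but the logs alone do not prove it, so the HDFS logs and a smaller input are still worth checking.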
