Re: Accessing stderr with Hadoop Streaming

2009-06-23 Thread Mayuran Yogarajah
S D wrote: Is there a way to access stderr when using Hadoop Streaming? I see how stdout is written to the log files but I'm more concerned about what happens when errors occur. Access to stderr would help debug when a run doesn't complete successfully but I haven't been able to figure out how
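A minimal sketch of a streaming mapper that keeps diagnostics on stderr, assuming a Python mapper and a word-count style job (neither is stated in the thread; `run_mapper` is a hypothetical name). Hadoop Streaming reads map output from the script's stdout, so anything written to stderr stays out of the data and should end up in the task attempt's stderr log; where that log is kept depends on the cluster setup.

```python
import sys


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Emit word<TAB>1 pairs on `out`; report skipped input on `err`.

    Returns the number of input lines that were skipped.
    """
    skipped = 0
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            # Diagnostics go to stderr so they never mix with map output.
            err.write("skipping blank line %d\n" % lineno)
            skipped += 1
            continue
        for word in line.split():
            out.write("%s\t1\n" % word)
    # Flush so the message is not lost if the task is killed soon after.
    err.flush()
    return skipped
```

Used as an actual mapper script, the file would end with a call to `run_mapper(sys.stdin)`.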

Re: Every time the mapping phase finishes I see this

2009-06-08 Thread Mayuran Yogarajah
I should mention that these are Hadoop streaming jobs, Hadoop version hadoop-0.18.3. Any idea about the empty stdout/stderr/syslog logs? I have no way to really track down what's causing them. Thanks. Steve Loughran wrote: Mayuran Yogarajah wrote: There are always a few 'Failed/Killed Task

Every time the mapping phase finishes I see this

2009-06-06 Thread Mayuran Yogarajah
There are always a few 'Failed/Killed Task Attempts', and when I view the logs for these I see:
- some that are empty, i.e. stdout/stderr/syslog logs are all blank
- several that say: 2009-06-06 20:47:15,309 WARN org.apache.hadoop.mapred.TaskTracker: Error running child java.io.IOException:

Logging in Hadoop Stream jobs

2009-05-08 Thread Mayuran Yogarajah
How do people handle logging in a Hadoop streaming job? I'm currently looking at using syslog for this, but I'd like to know what other approaches people are using. Thanks.
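One possible approach, sketched here for a streaming script written in Python (an assumption; the thread does not name a language), is to send log records to stderr through the standard `logging` module so they stay separate from the job's stdout data. `make_logger` is a hypothetical helper name.

```python
import logging
import sys


def make_logger(name, stream=None):
    """Return a logger that writes timestamped records to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    # Do not pass records on to the root logger, to avoid duplicate lines.
    logger.propagate = False
    return logger
```

For the syslog route mentioned in the question, the `StreamHandler` could be swapped for `logging.handlers.SysLogHandler` from the same standard library; the address to pass to it depends on the host's syslog configuration.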

java.io.IOException: All datanodes are bad. Aborting...

2009-05-06 Thread Mayuran Yogarajah
I have 2 directories listed for dfs.data.dir, and one of them reached 100% usage during a job I ran. I suspect that's the reason I see this error in the logs. Can someone please confirm this? Thanks.
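A small sketch for spotting a data directory that is close to full, assuming Python 3 is available on the datanode. It only reports filesystem usage and does not by itself confirm the cause of the error above; `nearly_full` and the 95% threshold are hypothetical choices.

```python
import shutil


def nearly_full(dirs, threshold=0.95):
    """Return (directory, used_fraction) for each directory at or above `threshold`."""
    flagged = []
    for d in dirs:
        usage = shutil.disk_usage(d)  # named tuple: total, used, free (bytes)
        fraction = usage.used / usage.total
        if fraction >= threshold:
            flagged.append((d, fraction))
    return flagged
```

It would be called with the paths configured in dfs.data.dir, for example `nearly_full(["/path/one", "/path/two"])`, where both paths are placeholders.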

Re: Sequence of Streaming Jobs

2009-05-02 Thread Mayuran Yogarajah
Billy Pearson wrote: I did this with an array of commands for the jobs in a PHP script, checking the return value of each job to tell whether it failed or not. Billy I have this same issue. How do you check if a job failed or not? You mentioned checking the return code? How are you doing that?
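The approach described in the quote (an array of commands, stopping when one fails) could be sketched as follows. This uses Python rather than the PHP mentioned, and it assumes the command that launches each job exits with a non-zero code when the job fails, which should be verified for the Hadoop version in use. `run_sequence` is a hypothetical name.

```python
import subprocess
import sys


def run_sequence(commands):
    """Run each command in order and stop at the first failure.

    Returns the index of the failing command, or -1 if all succeeded.
    """
    for i, cmd in enumerate(commands):
        rc = subprocess.call(cmd)  # blocks until the command exits
        if rc != 0:
            sys.stderr.write(
                "step %d failed with exit code %d: %s\n" % (i, rc, " ".join(cmd))
            )
            return i
    return -1
```

Each entry in `commands` is an argument list, so the full job launch command with all of its arguments would go in one list per job.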

Re: Master crashed

2009-04-30 Thread Mayuran Yogarajah
Alex Loddengaard wrote: I'm confused. Why are you trying to stop things when you're bringing the name node back up? Try running start-all.sh instead. Alex Won't that try to start the daemons on the slave nodes again? They're already running. M On Tue, Apr 28, 2009 at 4:00 PM, Mayuran

Master crashed

2009-04-28 Thread Mayuran Yogarajah
The master in my cluster crashed; the dfs/mapred Java processes are still running on the slaves. What should I do next? I brought the master back up and ran stop-mapred.sh and stop-dfs.sh and it said this: slave1.test.com: no tasktracker to stop slave1.test.com: no datanode to stop Not sure

Checking if a streaming job failed

2009-04-02 Thread Mayuran Yogarajah
Hello, does anyone know how I can check if a streaming job (in Perl) has failed or succeeded? The only way I can see at the moment is to check the web interface for that jobID and parse out the '*Status:*' value. Is it not possible to do this using 'hadoop job -status'? I see there is a count

Hadoop Upgrade Wiki

2009-03-13 Thread Mayuran Yogarajah
Step 8 of the upgrade process mentions copying the 'edits' and 'fsimage' files to a backup directory. After step 19 it says: 'In case of failure the administrator should have the checkpoint files in order to be able to repeat the procedure from the appropriate point or to restart the old

Re: HDFS is corrupt, need to salvage the data.

2009-03-10 Thread Mayuran Yogarajah
lohit wrote: How many Datanodes do you have? From the output it looks like, at the point when you ran fsck, you had only one datanode connected to your NameNode. Did you have others? Also, I see that your default replication is set to 1. Can you check if your datanodes are up and running?

HDFS is corrupt, need to salvage the data.

2009-03-09 Thread Mayuran Yogarajah
Hello, it seems the HDFS in my cluster is corrupt. This is the output from hadoop fsck:
Total size: 9196815693 B
Total dirs: 17
Total files: 157
Total blocks: 157 (avg. block size 58578443 B)
CORRUPT FILES: 157
MISSING BLOCKS: 157