S D wrote:
Is there a way to access stderr when using Hadoop Streaming? I see how
stdout is written to the log files, but I'm more concerned about what
happens when errors occur. Access to stderr would help debug when a run
doesn't complete successfully, but I haven't been able to figure out how.
I should mention these are Hadoop Streaming jobs, Hadoop version
hadoop-0.18.3.
Any idea about the empty stdout/stderr/syslog logs? I have no way to
really track down what's causing them.
thanks
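For what it's worth, each tasktracker keeps per-attempt logs on its own local disk; in the 0.18-era layout they sit under the log directory as userlogs/<attempt_id>/ with stdout, stderr and syslog files. A minimal sketch for inspecting them on one node (the path layout and the HADOOP_LOG_DIR fallback are assumptions, not something from this thread):

```shell
# show_task_stderr: print the tail of every task attempt's stderr log on
# this node. Path layout assumed: $HADOOP_LOG_DIR/userlogs/<attempt_id>/stderr
show_task_stderr() {
  log_dir=${HADOOP_LOG_DIR:-/var/log/hadoop}
  for f in "$log_dir"/userlogs/*/stderr; do
    [ -f "$f" ] || continue           # glob may match nothing
    echo "== $f =="
    tail -n 20 "$f"                   # the failing exception is usually at the end
  done
}
```

Run it on the node that executed the failed attempt (the job web UI names the host); for streaming jobs, whatever your script printed to stderr ends up in that file.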
Steve Loughran wrote:
Mayuran Yogarajah wrote:
There are always a few 'Failed/Killed Task Attempts', and when I view
the logs for these I see:
- some that are empty, i.e. the stdout/stderr/syslog logs are all blank
- several that say:
2009-06-06 20:47:15,309 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException:
How do people handle logging in a Hadoop Streaming job?
I'm currently looking at using syslog for this, but would like to know
what other approaches people are using.
thanks
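One common approach in Streaming: anything the mapper or reducer writes to stderr lands in the task attempt's stderr log, and lines of the form reporter:counter:... or reporter:status:... written to stderr are interpreted by the framework as counter/status updates rather than plain log text. A minimal mapper sketch using both (shell rather than Perl here; the job and counter names are made up):

```shell
# Minimal streaming mapper sketch: emit <line>\t1 pairs on stdout,
# diagnostics and a counter update on stderr.
map() {
  n=0
  while IFS= read -r line; do
    n=$((n + 1))
    printf '%s\t1\n' "$line"                       # normal map output
    echo "reporter:status:processed $n lines" >&2  # status update for the task UI
  done
  echo "reporter:counter:MyJob,LinesRead,$n" >&2   # custom counter
}
```

In a real job this would be a standalone script passed via -mapper; the counter then shows up on the job page, which is handy for spotting where a run died.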
I have 2 directories listed for dfs.data.dir, and one of them got to
100% used during a job I ran. I suspect that's the reason I see this
error in the logs. Can someone please confirm this?
thanks
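Whether or not it explains the error above, keeping headroom on each volume avoids this class of failure. A hedged hadoop-site.xml sketch (property names are from the 0.18-era default config; the paths and the 1 GB value are purely illustrative):

```xml
<!-- hadoop-site.xml sketch; paths and values are illustrative -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/dfs/data,/disk2/dfs/data</value>
</property>
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- bytes to keep free on each volume for non-DFS use: 1 GB -->
  <value>1073741824</value>
</property>
```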
Billy Pearson wrote:
I did this with an array of commands for the jobs in a PHP script,
checking the return code of each job to tell whether it failed or not.
Billy
I have this same issue. How do you check if a job failed or not? You
mentioned checking the return code; how are you doing that?
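The launcher itself exits nonzero when the submitted job fails, so the shell return code is enough; no need to scrape the web UI. A sketch (the jar path and script names are assumptions for a 0.18 install):

```shell
# run_job wraps the actual streaming invocation; hadoop's launcher
# exits nonzero when the submitted job fails.
run_job() {
  # Illustrative paths; adjust the jar location and scripts for your setup.
  hadoop jar "$HADOOP_HOME"/contrib/streaming/hadoop-0.18.3-streaming.jar \
    -input input/ -output output/ -mapper mapper.pl -reducer reducer.pl
}

check_job() {
  if run_job; then
    echo "job succeeded"
  else
    status=$?                                  # run_job's exit code
    echo "job failed (exit code $status)" >&2
    return "$status"
  fi
}
```

The same idea works from PHP: exec('hadoop jar ...', $out, $rc) and test $rc.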
Alex Loddengaard wrote:
I'm confused. Why are you trying to stop things when you're bringing the
name node back up? Try running start-all.sh instead.
Alex
Won't that try to start the daemons on the slave nodes again? They're
already running.
M
On Tue, Apr 28, 2009 at 4:00 PM, Mayuran
The master in my cluster crashed, the dfs/mapred java processes are
still running on the slaves. What should I do next? I brought the master
back up and ran stop-mapred.sh and stop-dfs.sh and it said this:
slave1.test.com: no tasktracker to stop
slave1.test.com: no datanode to stop
Not sure
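One way to sidestep the question above: the per-node control script starts a single daemon on the local machine only, so the already-running slave daemons are left untouched (start-all.sh would also contact every slave). A sketch using the 0.18-era bin/ scripts, with HADOOP_HOME assumed to point at the install:

```shell
# start_master: bring back only the master-side daemons after a master
# crash; hadoop-daemon.sh acts on the local node and does not touch slaves.
start_master() {
  "$HADOOP_HOME"/bin/hadoop-daemon.sh start namenode
  "$HADOOP_HOME"/bin/hadoop-daemon.sh start jobtracker
}
```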
Hello, does anyone know how I can check if a streaming job (in Perl) has
failed or succeeded? The only way I can see at the moment is to check
the web interface for that job ID and parse out the 'Status:' value.
Is it not possible to do this using 'hadoop job -status'? I see there
is a count
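'hadoop job -status <jobid>' does print a parsable summary, though the exact field names vary between versions, so treat this helper as a sketch; the exit code of the hadoop jar command that submitted the job is usually the more robust signal.

```shell
# job_state: read `hadoop job -status <id>` output on stdin and print the
# value of the first state/status field. Field names vary by version.
job_state() {
  awk -F': *' 'tolower($1) ~ /state|status/ { print $2; exit }'
}

# Typical use (needs a running cluster, so commented out here):
#   hadoop job -status job_200904281600_0001 | job_state
```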
Step 8 of the upgrade process mentions copying the 'edits' and
'fsimage' files to a backup directory. After step 19 it says:
'In case of failure the administrator should have the checkpoint files
in order to be able to repeat the procedure from the appropriate point
or to restart the old
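The backup step above can be sketched as follows; the current/ subdirectory layout is the 0.18-era one, and the directory arguments stand in for whatever dfs.name.dir and your backup location actually are:

```shell
# backup_checkpoint: copy the namenode image and edit log somewhere safe
# before the upgrade. In 0.18 they live in <dfs.name.dir>/current/.
backup_checkpoint() {
  name_dir=$1
  backup_dir=$2
  mkdir -p "$backup_dir"
  cp "$name_dir/current/fsimage" "$name_dir/current/edits" "$backup_dir/"
}
```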
lohit wrote:
How many datanodes do you have?
From the output it looks like, at the point when you ran fsck, you had
only one datanode connected to your NameNode. Did you have others?
Also, I see that your default replication is set to 1. Can you check
whether your datanodes are up and running?
Hello, it seems the HDFS in my cluster is corrupt. This is the output
from hadoop fsck:
 Total size:    9196815693 B
 Total dirs:    17
 Total files:   157
 Total blocks:  157 (avg. block size 58578443 B)
  CORRUPT FILES:        157
  MISSING BLOCKS:       157
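For scripting this kind of check, fsck ends with a summary line such as "The filesystem under path '/' is HEALTHY" (or CORRUPT). A small helper, assuming that wording:

```shell
# hdfs_healthy: read `hadoop fsck /` output on stdin; succeed only if the
# summary line reports HEALTHY.
hdfs_healthy() {
  grep -q "is HEALTHY"
}

# e.g.  hadoop fsck / | hdfs_healthy || echo "HDFS needs attention" >&2
```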