Hadoop streaming - Subprocess failed

2012-08-29 Thread Periya.Data
Hi, I am running a map-reduce job in Python and I get this error message. I do not understand what it means. Output is not written to HDFS. I am using CDH3u3. Any suggestions are appreciated. MapAttempt TASK_TYPE=MAP TASKID=task_201208232245_2812_m_00
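
For context, a CDH3 streaming job is typically launched along the lines below (the jar location, HDFS paths, and script names here are assumptions, not details from the post); a non-zero exit from the mapper or reducer script is the usual trigger for a "subprocess failed" error:

    # sketch of a typical CDH3 streaming invocation (paths are assumed)
    hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-*.jar \
        -input /user/me/input \
        -output /user/me/output \
        -mapper mapper.py \
        -reducer reducer.py \
        -file mapper.py \
        -file reducer.py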

RE: Metrics ..

2012-08-29 Thread Wong, David (DMITS)
Here's a snippet of tasktracker metrics using Metrics2. (I think there were more gaps in the pre-metrics2 versions.) Note that you'll need to have hadoop-env.sh and hadoop-metrics2.properties set up on all the nodes you want reports from. 1345570905436 ugi.ugi: context=ugi,
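
For readers setting this up, a minimal hadoop-metrics2.properties sketch that routes TaskTracker metrics to a local file through the built-in FileSink might look like this (the file name and ten-second period are assumptions):

    # hadoop-metrics2.properties -- minimal FileSink sketch (values assumed)
    *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
    *.period=10
    # send only the tasktracker daemon's metrics to a local file
    tasktracker.sink.file.filename=/tmp/tasktracker-metrics.out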

Re: Metrics ..

2012-08-29 Thread Mark Olimpiati
Hi David, I enabled the jvm.class in hadoop-metrics.properties, but your output seems to come from something else (dfs.class or mapred.class), which reports Hadoop daemon performance. For example, your output shows processName=TaskTracker, which I'm not looking for. How can I report jvm
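
For the older metrics1 framework being asked about here, enabling the jvm context in hadoop-metrics.properties usually means pointing it at a concrete context class, for example the file-based one (the class choice, period, and file name below are assumptions for illustration):

    # hadoop-metrics.properties -- jvm context via FileContext (values assumed)
    jvm.class=org.apache.hadoop.metrics.file.FileContext
    jvm.period=10
    jvm.fileName=/tmp/jvm-metrics.log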

no output written to HDFS

2012-08-29 Thread Periya.Data
Hi All, My Hadoop streaming job (in Python) runs to completion (both map and reduce show 100% complete). But when I look at the output directory in HDFS, the part files are empty. I do not know what might be causing this behavior. I understand that the percentages represent the records that
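
A frequent cause of empty part files is a streaming reducer that consumes stdin but never writes to stdout, since streaming captures only what the script prints. A minimal word-count-style reducer sketch is shown below (the script name and tab-separated key/value convention follow the streaming defaults; the code is an illustration, not the poster's script):

    #!/usr/bin/env python
    # reducer.py -- minimal streaming reducer sketch: sums counts per key.
    # Streaming feeds sorted "key<TAB>value" lines on stdin and writes
    # whatever this script prints to stdout into the HDFS part files.
    import sys

    current_key, total = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current_key:
            if current_key is not None:
                sys.stdout.write("%s\t%d\n" % (current_key, total))
            current_key, total = key, 0
        total += int(value or 0)
    if current_key is not None:
        sys.stdout.write("%s\t%d\n" % (current_key, total))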

Re: no output written to HDFS

2012-08-29 Thread Bertrand Dechoux
Do you observe the same thing when running without Hadoop (cat, map, sort and then reduce)? Could you provide the counters of your job? You should be able to get them using the job tracker interface. The most probable answer, without more information, would be that your reducers do not output any
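
Bertrand's local test can be run as a plain shell pipeline, for example (the input and script names are placeholders):

    # simulate the streaming job locally; if this output is also empty,
    # the problem is in the scripts rather than in Hadoop
    cat sample_input.txt | python mapper.py | sort | python reducer.py > local_output.txt
    wc -l local_output.txt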