Re: streaming job in python that reports progress

2011-01-28 Thread Harsh J
Already answered in the Streaming docs: http://hadoop.apache.org/mapreduce/docs/current/streaming.html#How+do+I+update+status+in+streaming+applications%3F On Sat, Jan 29, 2011 at 5:21 AM, felix gao wrote: > mighty user group, > I am trying to write a streaming job that does a lot of io in a pytho
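
For readers of this thread, a minimal sketch of what the linked Streaming docs describe: a streaming task updates its status by writing a line of the form reporter:status:<message> to stderr, and increments a counter with reporter:counter:<group>,<counter>,<amount>. The helper names, the chunk size, and the 60-second interval below are illustrative assumptions, not part of Hadoop. The interval only needs to stay well under the task timeout (mapred.task.timeout, 10 minutes by default).

```python
import sys
import time


def report_status(message, stream=sys.stderr):
    # Hadoop Streaming treats stderr lines with this prefix as status updates.
    stream.write("reporter:status:%s\n" % message)
    stream.flush()


def report_counter(group, counter, amount=1, stream=sys.stderr):
    # Same mechanism for counters: reporter:counter:<group>,<counter>,<amount>
    stream.write("reporter:counter:%s,%s,%d\n" % (group, counter, amount))
    stream.flush()


def copy_with_heartbeat(src, dst, chunk_size=1 << 20, interval=60.0,
                        stream=sys.stderr):
    # Hypothetical helper: copy src to dst in chunks and report status
    # at most once per `interval` seconds so the task is not killed as idle.
    last_report = time.time()
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        now = time.time()
        if now - last_report >= interval:
            report_status("copied %d bytes" % total, stream)
            last_report = now
    return total
```

For the gzip case in the question, dst could be a gzip.GzipFile opened for writing, so the status line goes out between compressed chunks rather than only after the whole file is done.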

Re: How to get file configuration from mapper

2011-01-28 Thread li ping
DistributedCache could help you. http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#DistributedCache On Fri, Jan 28, 2011 at 6:47 PM, Joan wrote: > Hi > > I'm trying to access to my custom configuration file (myconfig.xml) from > MyMapper. > > So I'm doing: > > *File configuratio

Re: Hadoop Version

2011-01-28 Thread hadoop user
Redirecting to common-user. You can check the Hadoop version using either of the following methods. CLI: run the hadoop version command (bin/hadoop version). Web interface: check the NameNode or JobTracker web interface; it shows the version number. - Ravi On Fri, Jan 28, 2011 at 11:24 AM, wrote: > H
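
When several Hadoop copies are on one machine, it can help to script the CLI check above. The following is only a sketch: it assumes the first line of the hadoop version output looks like "Hadoop 0.20.2", and the function names and the hadoop_cmd parameter are hypothetical.

```python
import re
import subprocess


def parse_hadoop_version(output):
    # Assumes a line of the form "Hadoop <version>"; returns None otherwise.
    match = re.search(r"^Hadoop\s+(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None


def installed_hadoop_version(hadoop_cmd="hadoop"):
    # hadoop_cmd may be a full path such as bin/hadoop inside one install,
    # which lets each bundled copy be checked separately.
    output = subprocess.check_output([hadoop_cmd, "version"])
    return parse_hadoop_version(output.decode("utf-8", "replace"))
```

Running installed_hadoop_version against each install's own bin/hadoop script shows which copy reports which version.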

streaming job in python that reports progress

2011-01-28 Thread felix gao
mighty user group, I am trying to write a streaming job that does a lot of io in a python program. I know if I don't report back every x minutes the job will be terminated. How do I report back to the task tracker in my streaming python job that is in the middle of the gzip for example. Thanks,

Re: Dump question about initializing log4j...

2011-01-28 Thread Shrijeet Paliwal
Jon, As defined in http://hadoop.apache.org/mailing_lists.html , please send user questions to mapreduce-user@hadoop.apache.org. Do you have log4j.properties (the hadoop one, found in conf directory) in your class path? On Fri, Jan 28, 2011 at 1:48 PM, Jonathan Coveney wrote: > I am trying to r

Hadoop Version

2011-01-28 Thread praveen.peddi
Hello all, I am having issues with accessing hdfs and I figured it's due to a version mismatch. I know my jar files have multiple copies of hadoop (pig has its own, I have hadoop 0.20.2 and Whirr had its own hadoop copy). My question is how to find the right version of hadoop that matches with the one

Re: Draining/Decommisioning a tasktracker

2011-01-28 Thread Koji Noguchi
Hi Rishi, https://issues.apache.org/jira/browse/HADOOP-5643 was added in version 0.21. Some 0.20 branches may have this as well. However, even with this feature, I believe the TaskTracker immediately kills itself when all the tasks on the TaskTracker finish but doesn't take care of the jobs still

Re: java.lang.RuntimeException: problem advancing post rec#499959

2011-01-28 Thread Harsh J
This looks like a case of data corruption (of one/more intermediate files, a.k.a. IFiles). On Fri, Jan 28, 2011 at 11:21 PM, Pedro Costa wrote: > Hi, > > I'm running the Terasort problem in cluster mode, and I've got a > RunTimeException in a Reduce Task. > > java.lang.RuntimeException: problem a

java.lang.RuntimeException: problem advancing post rec#499959

2011-01-28 Thread Pedro Costa
Hi, I'm running the Terasort problem in cluster mode, and I've got a RuntimeException in a Reduce Task. java.lang.RuntimeException: problem advancing post rec#499959 (Please, see attachment) What does this error mean? Is it a problem with wrong KEYOUT and VALUEOUT in the Reduce Task? Thanks, --

Re: Draining/Decommisioning a tasktracker

2011-01-28 Thread Harsh J
Moving discussion to the MapReduce-User list: mapreduce-user@hadoop.apache.org Reply inline: On Fri, Jan 28, 2011 at 2:39 PM, rishi pathak wrote: > Hi, > Is there a way to drain a tasktracker. What we require is not to > schedule any more map/red tasks onto a tasktracker(mark it offline)

How to get file configuration from mapper

2011-01-28 Thread Joan
Hi, I'm trying to access my custom configuration file (myconfig.xml) from MyMapper. So I'm doing: *File configurationFile = new File("./conf/", "myconfig.xml");* But when I look at the absolute path of the configuration file I get: * /tmp/hadoop-user/mapred/local/taskTracker/user/jobcache/job_2011

Fwd: Draining/Decommisioning a tasktracker

2011-01-28 Thread rishi pathak
Hi, Is there a way to drain a tasktracker? What we require is that no more map/red tasks are scheduled onto a tasktracker (mark it offline), while the running tasks are not affected. -- --- Rishi Pathak National PARAM Supercomputing Facility C-DAC, Pune, India -- --- Rishi P

RE: Hi

2011-01-28 Thread mahesh
Hello Rahul, Thanks for your reply. Keep replying. I have already started. I have studied the IEEE paper "The Hadoop Distributed File System" by Konstantin, Radia. What other material should I read to find more practical knowledge, and also the areas where research is required? How do I start for P

Re: Hi

2011-01-28 Thread rahul patodi
Hi Mahesh, I am currently working on a project related to Hadoop. You can ask for any help in your research. Have you started some thing in Hadoop? On Fri, Jan 28, 2011 at 12:20 PM, mahesh wrote: > Hi Rahul > > I am looking for research in HDFS. Can u help me out. Pl write me. > > > > Regards