@ Harsh - Yeah, mapred.task.timeout is the valid option, but for some reason it's not working the way it should, and I am not sure what the cause could be. The thing is, my jobs are running fine; they are just slow in the shuffle phase sometimes, not every time. So I was thinking: as an admin, can we control the running of jobs, just as a test, where we kill not only the jobs that are hanging but also the jobs that are taking more execution time than expected? The problem in my case is that the end users don't want to go through the pain of managing/controlling jobs on Hadoop. They want all of this job handling to happen automatically, which is what made me think along these lines (and I know it is not the best way).
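Just to make what I mean concrete, something like the rough, untested sketch below, run from cron, is what I had in mind. It assumes the Hadoop 1.x "hadoop job -list" output, where the first column is the job id and the third column is the start time in milliseconds, and MAX_RUNTIME_SECS is just a made-up threshold:

#!/bin/bash
# Rough sketch, not tested: kill any currently listed MapReduce job that has
# been running longer than MAX_RUNTIME_SECS. Assumes the Hadoop 1.x
# "hadoop job -list" output format (JobId, State, StartTime in ms, ...).
MAX_RUNTIME_SECS=3600                  # hypothetical limit: 1 hour
NOW_MS=$(( $(date +%s) * 1000 ))

hadoop job -list 2>/dev/null | grep '^job_' | \
while read JOBID STATE STARTTIME REST; do
  AGE_SECS=$(( (NOW_MS - STARTTIME) / 1000 ))
  if [ "$AGE_SECS" -gt "$MAX_RUNTIME_SECS" ]; then
    echo "Killing $JOBID (running for ${AGE_SECS}s)"
    hadoop job -kill "$JOBID"
  fi
done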
Anyways, going away from the topic -- is there any way I can improve my shuffling (through configuration parameters only, given that the users don't know about minimizing the number of key/value pairs)? I have put a sketch of the shuffle-side parameters I am thinking of trying below the quoted thread.

Thanks,
Praveenesh

On Mon, Jan 30, 2012 at 1:06 PM, Masoud <mas...@agape.hanyang.ac.kr> wrote:
> Hi,
>
> Every Map/Reduce app has a Reporter. You can set the configuration
> parameter {mapred.task.timeout} of the Reporter to your desired value.
>
> Good luck.
>
> On 01/30/2012 04:14 PM, praveenesh kumar wrote:
>
>> Yeah, I am aware of that, but it needs you to explicitly monitor the job,
>> look for the job id, and then run the hadoop job -kill command.
>> What I want to know is: "Is there any way to do all this automatically by
>> providing some timer or something, so that if my job is taking more than
>> some predefined time, it would get killed automatically?"
>>
>> Thanks,
>> Praveenesh
>>
>> On Mon, Jan 30, 2012 at 12:38 PM, Prashant Kommireddi
>> <prash1...@gmail.com> wrote:
>>
>>> You might want to take a look at the kill command: "hadoop job -kill
>>> <jobid>".
>>>
>>> Prashant
>>>
>>> On Sun, Jan 29, 2012 at 11:06 PM, praveenesh kumar <praveen...@gmail.com>
>>> wrote:
>>>> Is there any way through which we can kill Hadoop jobs that are taking
>>>> too long to execute?
>>>>
>>>> What I want to achieve is: if some job is running for more than
>>>> "_some_predefined_timeout_limit_", it should be killed automatically.
>>>>
>>>> Is it possible to achieve this, through shell scripts or any other way?
>>>>
>>>> Thanks,
>>>> Praveenesh
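As mentioned above, here is the sketch of the shuffle-side settings I was planning to try in mapred-site.xml. The parameter names are the Hadoop 1.x ones as I understand them, and the values are only placeholders I picked for experimenting, not recommendations:

<!-- Sketch of shuffle-related tuning (Hadoop 1.x parameter names); values are placeholders -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
  <!-- compress map output to cut down the data moved during shuffle -->
</property>
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>10</value>
  <!-- parallel fetch threads per reducer (default is 5) -->
</property>
<property>
  <name>io.sort.mb</name>
  <value>200</value>
  <!-- map-side sort buffer; a bigger buffer means fewer spills (default 100) -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>50</value>
  <!-- number of streams merged at once during sort/merge (default 10) -->
</property>
<property>
  <name>mapred.job.shuffle.input.buffer.percent</name>
  <value>0.70</value>
  <!-- fraction of reducer heap used to buffer fetched map outputs -->
</property>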