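How long does the code take to start locally in a single thread? That should tell you how much of the delay is your own initialization and how much is Hadoop's. Can you also reuse the JVM, so that it starts only once per node per job rather than once per task?

conf.setNumTasksToExecutePerJvm(-1)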
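If you would rather not change code, the same setting can be expressed as a job property. This is only a sketch and assumes a Hadoop release from the 0.19/0.20 line, where the setter above writes the property mapred.job.reuse.jvm.num.tasks; the property name may differ in other versions, so check the mapred-default.xml that ships with your release. A value of -1 means there is no limit on the number of tasks of the same job that one JVM may run.

```xml
<!-- Job configuration fragment (assumed: Hadoop 0.19/0.20 property name). -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- -1 = no limit on tasks per JVM; the default of 1 starts a new JVM for every task -->
  <value>-1</value>
</property>
```

JVM reuse only saves the per-task JVM startup. It will not shorten the time spent shipping a large set of jar files to the nodes before the first task runs.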
Cheers,
Tim

On Sun, Jun 28, 2009 at 9:43 PM, Marcus Herou <marcus.he...@tailsweep.com> wrote:
> Hi.
>
> I wonder how one should improve the startup times of a Hadoop job. Some of my
> jobs, which have a lot of dependencies in terms of many jar files, take a long
> time to start in Hadoop, sometimes up to 2 minutes.
> The data input amounts in these cases are negligible, so it seems that Hadoop
> has a really high setup cost, which I can live with, but this seems too much.
>
> Let's say a job takes 10 minutes to complete; then it is bad if it takes 2
> minutes to set it up... 20-30 seconds max would be a lot more reasonable.
>
> Hints?
>
> //Marcus
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> marcus.he...@tailsweep.com
> http://www.tailsweep.com/