Thank you, Jason. I found the example. So, is there a way to share the same JVM between different jobs?
________________________________
From: jason hadoop <jason.had...@gmail.com>
To: core-user@hadoop.apache.org
Sent: Tuesday, June 16, 2009 7:22:16 PM
Subject: Re: Can I share datas for several map tasks?

In the example code download bundle, in the package
com.apress.hadoopbook.examples.advancedtechniques, is the class
JVMReuseAndStaticInitializers.java, which demonstrates sharing data between
instances using JVM reuse. I built this to prove to myself that it was
possible. It never got an actual write-up in the book itself.

On Tue, Jun 16, 2009 at 6:55 PM, Hello World <snowlo...@gmail.com> wrote:

> I can't get your book, so can you give me a few more words to describe the
> solution? I would appreciate it very much.
>
> -snowloong
>
> On Tue, Jun 16, 2009 at 9:51 PM, jason hadoop <jason.had...@gmail.com> wrote:
>
> > In the examples for my book there is an example of JVM reuse with static
> > data shared between JVMs.
> >
> > On Tue, Jun 16, 2009 at 1:08 AM, Hello World <snowlo...@gmail.com> wrote:
> >
> > > Thanks for your reply. Can you do me a favor and check this?
> > > I modified mapred-default.xml as follows:
> > >
> > > <property>
> > >   <name>mapred.job.reuse.jvm.num.tasks</name>
> > >   <value>-1</value>
> > >   <description>How many tasks to run per jvm. If set to -1, there is
> > >   no limit.
> > >   </description>
> > > </property>
> > >
> > > Then I ran bin/stop-all.sh; bin/start-all.sh to restart hadoop.
> > >
> > > This is my program:
> > >
> > > public class WordCount {
> > >
> > >   public static class TokenizerMapper
> > >       extends Mapper<Object, Text, Text, IntWritable> {
> > >
> > >     private final static IntWritable one = new IntWritable(1);
> > >     private Text word = new Text();
> > >     public static int[] ToBeSharedData = new int[1024 * 1024 * 16];
> > >
> > >     protected void setup(Context context
> > >                          ) throws IOException, InterruptedException {
> > >       //Init shared data
> > >       ToBeSharedData[0] = 12345;
> > >       System.out.println("setup shared data[0] = " + ToBeSharedData[0]);
> > >     }
> > >
> > >     public void map(Object key, Text value, Context context
> > >                     ) throws IOException, InterruptedException {
> > >       StringTokenizer itr = new StringTokenizer(value.toString());
> > >       while (itr.hasMoreTokens()) {
> > >         word.set(itr.nextToken());
> > >         context.write(word, one);
> > >       }
> > >       System.out.println("read shared data[0] = " + ToBeSharedData[0]);
> > >     }
> > >   }
> > >
> > > First, can you tell me how to make sure "jvm reuse" is taking effect? I
> > > didn't see anything different from before. I used the "top" command under
> > > Linux and saw the same number of java processes and the same memory usage.
> > >
> > > Second, can you tell me how to make "ToBeSharedData" be initialized only
> > > once and be readable from other MapTasks on the same node? Or is this not
> > > a suitable programming style for map-reduce?
> > >
> > > By the way, I'm using hadoop-0.20.0, in pseudo-distributed mode on a
> > > single node.
> > >
> > > Thanks in advance.
> > >
> > > On Tue, Jun 16, 2009 at 1:48 PM, Sharad Agarwal <shara...@yahoo-inc.com> wrote:
> > >
> > > > snowloong wrote:
> > > > > Hi,
> > > > > I want to share some data structures among the map tasks on the same
> > > > > node (not through files). I mean, if one map task has already
> > > > > initialized some data structures (e.g. an array or a list), can other
> > > > > map tasks share this memory and access it directly? I don't want to
> > > > > reinitialize this data, and I want to save some memory. Can hadoop
> > > > > help me do this?
> > > >
> > > > You can enable jvm reuse across tasks. See mapred.job.reuse.jvm.num.tasks
> > > > in mapred-default.xml for usage. Then you can cache the data in a static
> > > > variable in your mapper.
> > > >
> > > > - Sharad
> >
> > --
> > Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> > http://www.apress.com/book/view/9781430219422
> > www.prohadoopbook.com a community for Hadoop Professionals

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
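
For reference, Sharad's suggestion of caching the data in a static variable in the mapper can be sketched roughly as below. This is not the JVMReuseAndStaticInitializers class from the book bundle, which is not reproduced in this thread. It is a Hadoop-free stand-in, and the names SharedDataSketch, Holder, FakeTask and initCount are made up for illustration. The idea is that a static field is initialized once per JVM when its class is first used, so every task that runs in a reused JVM sees the same array. In a real mapper the read would presumably happen in setup(), with mapred.job.reuse.jvm.num.tasks set as shown earlier in the thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedDataSketch {

    /** Hypothetical holder class: the JVM runs its static initializers once, on first use. */
    static final class Holder {
        // Counts how often the expensive load runs (illustration only).
        static final AtomicInteger initCount = new AtomicInteger();

        // Initialized exactly once per JVM, however many tasks read it.
        static final int[] DATA = load();

        private static int[] load() {
            initCount.incrementAndGet();
            // Smaller than the 16M-int array in the thread, to keep the sketch light.
            int[] data = new int[1024 * 1024];
            data[0] = 12345;
            return data;
        }
    }

    /** Stand-in for a map task (assumption: a real one would extend Hadoop's Mapper). */
    static final class FakeTask {
        int setup() {
            // Only reads the shared data; it must not re-initialize it here.
            return Holder.DATA[0];
        }
    }

    public static void main(String[] args) {
        // Simulate several tasks running one after another in the same JVM.
        for (int i = 0; i < 3; i++) {
            int value = new FakeTask().setup();
            System.out.println("task " + i + " read shared data[0] = " + value);
        }
        System.out.println("initializations = " + Holder.initCount.get());
    }
}
```

Note that the WordCount code quoted above assigns ToBeSharedData[0] inside setup(), so that assignment runs again for every task even when the JVM is reused. Moving the expensive work into a static initializer, as in the sketch, is one way to avoid that. Static data like this lives only inside a single JVM, so tasks that run in separate JVM processes each get their own copy.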