It really depends on what type of data you are sharing, how you are looking up the data, whether the data is read-write, and whether you care about consistency. If you don't care about consistency, I suggest that you shove the data into a BDB store (for key-value lookup) or a Lucene store, and copy the data to all the nodes. That way all data access will be in-process: no IPC overhead, no GC problems, and very fast lookups. Both BDB and Lucene have straightforward replication strategies.
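To make the in-process point concrete, here is a minimal sketch (names like SharedLookup and load() are hypothetical, not from Hadoop or BDB) of a per-JVM read-only lookup table: it is loaded lazily once per JVM and then shared by every mapper instance running in that JVM, so lookups are plain in-memory map reads. In practice load() would open the local BDB or Lucene copy that was replicated to the node.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one read-only lookup table per JVM, shared by all
// mapper instances in that JVM. Double-checked locking ensures the
// expensive load happens at most once.
final class SharedLookup {
    private static volatile Map<String, String> table;

    private SharedLookup() {}

    public static Map<String, String> get() {
        Map<String, String> t = table;
        if (t == null) {
            synchronized (SharedLookup.class) {
                t = table;
                if (t == null) {
                    t = load();   // runs once per JVM, not once per mapper
                    table = t;
                }
            }
        }
        return t;
    }

    // Placeholder loader; a real one would read the node-local BDB or
    // Lucene replica instead of hard-coding entries.
    private static Map<String, String> load() {
        Map<String, String> m = new ConcurrentHashMap<String, String>();
        m.put("example-key", "example-value");
        return m;
    }
}
```

Every call to SharedLookup.get() after the first returns the same map instance, so there is no per-lookup I/O or serialization cost.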
If the data is read-write and you need consistency, you should probably forget about MapReduce and just run everything on big iron.

Regards,
Alan Ho

----- Original Message ----
From: Devajyoti Sarkar <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Thursday, October 2, 2008 8:41:04 PM
Subject: Sharing an object across mappers

I think each mapper/reducer runs in its own JVM, which makes it impossible to share objects. I need to share a large object so that I can access it at memory speeds across all the mappers. Is it possible to have all the mappers run in the same VM? Or is there a way to do this across VMs at high speed? I guess RMI and other such methods would be just too slow.

Thanks,
Dev