Is the lookup table constant across each of the tasks? You could try putting it into memcached:
http://hcil.cs.umd.edu/trs/2009-01/2009-01.pdf

Matt

-----Original Message-----
From: Ian Upright [mailto:i...@upright.net]
Sent: Wednesday, June 15, 2011 3:42 PM
To: common-user@hadoop.apache.org
Subject: large memory tasks

Hello, I'm quite new to Hadoop, so I'd like to get an understanding of something.

Let's say I have a task that requires 16 GB of memory in order to execute. Hypothetically, it's some sort of big lookup table that needs that kind of memory. I could have 8 cores run the task in parallel (multithreaded), and all 8 cores can share that 16 GB lookup table. On another machine, I could have 4 cores run the same task, and they still share that same 16 GB lookup table.

Now, as I understand Hadoop, each task has its own memory. So if I have 4 tasks that run on one machine and 8 tasks on another, then the 4 tasks need a 64 GB machine and the 8 tasks need a 128 GB machine. But let's say I really only have two machines, one with 4 cores and one with 8, each with only 24 GB.

How can the work be evenly distributed among these machines? Am I missing something? What other ways can this be configured so that this works properly?

Thanks, Ian
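The memory arithmetic in Ian's question can be checked with a rough sketch like the one below. The function names are hypothetical, and the model is deliberately simplified: it assumes one task per core, a private 16 GB copy of the table in every task, and no allowance for the OS or the task's own working memory.

```python
def per_task_copy_gb(tasks: int, table_gb: int) -> int:
    """RAM needed when every concurrent task loads its own copy of the table."""
    return tasks * table_gb


def max_concurrent_tasks(node_ram_gb: int, table_gb: int, cores: int) -> int:
    """Tasks that fit on a node when each holds a private copy of the table.

    Simplifying assumption: only the table counts toward memory use.
    """
    return min(cores, node_ram_gb // table_gb)


TABLE_GB = 16      # size of the lookup table from the question
NODE_RAM_GB = 24   # RAM on each of the two machines

for cores in (4, 8):
    needed = per_task_copy_gb(cores, TABLE_GB)
    fits = max_concurrent_tasks(NODE_RAM_GB, TABLE_GB, cores)
    print(f"{cores} cores: {needed} GB needed for one task per core, "
          f"{fits} task(s) fit in {NODE_RAM_GB} GB")
```

Under these assumptions each 24 GB machine can hold only one private copy, so only one task runs at a time regardless of core count. That is the motivation for keeping a single copy of the table outside the tasks, for example in a shared store such as memcached, as suggested in the reply.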