Re: About Hadoop optimizations

2009-05-07 Thread Tom White
On Thu, May 7, 2009 at 6:05 AM, Foss User foss...@gmail.com wrote: Thanks for your response again. I could not understand a few things in your reply. So, I want to clarify them. Please find my questions inline. On Thu, May 7, 2009 at 2:28 AM, Todd Lipcon t...@cloudera.com wrote: On Wed, May

About Hadoop optimizations

2009-05-06 Thread Foss User
1. Do the reducers of a job start only after all mappers have finished? 2. Say there are 10 slave nodes, and one of them is much slower than the others. While the mappers on the other 9 finish in 2 minutes, the one on the slow node might take 20 minutes. Is Hadoop

Re: About Hadoop optimizations

2009-05-06 Thread Todd Lipcon
On Wed, May 6, 2009 at 12:22 PM, Foss User foss...@gmail.com wrote: 1. Do the reducers of a job start only after all mappers have finished? The reducer tasks start so they can begin copying map output, but your actual reduce function does not. This is because it doesn't know that the data for

Re: About Hadoop optimizations

2009-05-06 Thread Foss User
Thanks for your response. I have a few more questions regarding optimizations. 1. Do Hadoop clients locally cache the data they last requested? 2. Is the metadata for the file blocks on a data node kept in the underlying OS's file system on the namenode, or is it kept in the namenode's RAM? 3

Re: About Hadoop optimizations

2009-05-06 Thread Foss User
Thanks for your response again. I could not understand a few things in your reply. So, I want to clarify them. Please find my questions inline. On Thu, May 7, 2009 at 2:28 AM, Todd Lipcon t...@cloudera.com wrote: On Wed, May 6, 2009 at 1:46 PM, Foss User foss...@gmail.com wrote: 2. Is the meta

Optimizations

2008-08-27 Thread Yih Sun Khoo
Right now I have a job whose reduce phase outputs its key-value pairs as records into a database. Is this the best way to load the database? What are some alternatives?

Re: Optimizations

2008-08-27 Thread Edward J. Yoon
Most DBMSs now offer a bulk insert mechanism, which avoids running a separate transaction for each record. Check whether yours does. -Edward On Thu, Aug 28, 2008 at 6:27 AM, Yih Sun Khoo [EMAIL PROTECTED] wrote: Optimizations Right now I have a job whose reducer phase outputs the key-value pairs as records into a database
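
To illustrate the bulk insert suggestion above, here is a minimal sketch of batching records into one statement and one commit per batch, rather than one transaction per record. It is not taken from the thread: it uses Python's built-in sqlite3 module as a stand-in for whatever database the job actually writes to, and the table name `kv`, the function `load_records`, and the batch size are all hypothetical.

```python
import sqlite3


def load_records(conn, records, batch_size=1000):
    """Insert (key, value) pairs in batches, committing once per batch.

    `records` is any iterable of 2-tuples. Returns the number of rows inserted.
    """
    cur = conn.cursor()
    batch = []
    total = 0
    for rec in records:
        batch.append(rec)
        if len(batch) >= batch_size:
            # One executemany call and one commit cover the whole batch.
            cur.executemany("INSERT INTO kv (k, v) VALUES (?, ?)", batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        cur.executemany("INSERT INTO kv (k, v) VALUES (?, ?)", batch)
        conn.commit()
        total += len(batch)
    return total


# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT, v INTEGER)")
inserted = load_records(conn, ((f"key{i}", i) for i in range(2500)))
```

Whether batching helps, and which batch size works best, depends on the database and driver in use, so it is worth measuring against the real target before relying on it.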