On Thu, May 7, 2009 at 6:05 AM, Foss User foss...@gmail.com wrote:
Thanks for your response again. I could not understand a few things in
your reply. So, I want to clarify them. Please find my questions
inline.
On Thu, May 7, 2009 at 2:28 AM, Todd Lipcon t...@cloudera.com wrote:
On Wed, May
1. Do the reducers of a job start only after all mappers have finished?
2. Say there are 10 slave nodes, and one of them is very
slow compared to the others. So, while the mappers on the other 9
have finished in 2 minutes, the one on the slow node might take 20
minutes. Is Hadoop
On Wed, May 6, 2009 at 12:22 PM, Foss User foss...@gmail.com wrote:
1. Do the reducers of a job start only after all mappers have finished?
The reducer tasks start so they can begin copying map output, but your
actual reduce function does not run yet. This is because it doesn't know that the
data for
Thanks for your response. I have a few more questions regarding optimizations.
1. Do Hadoop clients locally cache the data they last requested?
2. Is the metadata for file blocks on the data nodes kept in the
underlying OS's file system on the namenode, or is it kept in the
namenode's RAM?
3
Thanks for your response again. I could not understand a few things in
your reply. So, I want to clarify them. Please find my questions
inline.
On Thu, May 7, 2009 at 2:28 AM, Todd Lipcon t...@cloudera.com wrote:
On Wed, May 6, 2009 at 1:46 PM, Foss User foss...@gmail.com wrote:
2. Is the meta
Optimizations
Right now I have a job whose reducer phase outputs the key-value pairs as
records into a database. Is this the best way to be loading the
database? What are some alternatives?
Most DBMSs now have a bulk insert mechanism, so you don't need a separate
transaction for each record. Check whether yours does.
-Edward
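As a rough illustration of the bulk insert suggestion above, here is a minimal Java sketch that groups key-value pairs into batches and sends each batch with JDBC's PreparedStatement.addBatch() and executeBatch(), committing once per batch instead of once per record. The table name "results", the columns "k" and "v", the class name and the batch size are all assumptions made up for this example and do not come from the thread. Many databases also ship their own bulk loaders, which may be faster than JDBC batching.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class BatchedInsertSketch {
    // Hypothetical table and column names; replace with the real schema.
    static final String INSERT_SQL = "INSERT INTO results (k, v) VALUES (?, ?)";

    // Split records into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> chunk(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            int end = Math.min(i + batchSize, records.size());
            chunks.add(new ArrayList<>(records.subList(i, end)));
        }
        return chunks;
    }

    // Insert key-value pairs with one executeBatch() and one commit per chunk.
    static void insertAll(Connection conn, List<String[]> pairs, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // group each batch into a single transaction
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (List<String[]> batch : chunk(pairs, batchSize)) {
                for (String[] kv : batch) {
                    ps.setString(1, kv[0]); // key
                    ps.setString(2, kv[1]); // value
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the partially written batch
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

In a reducer, the same idea would mean buffering output pairs and flushing them every few hundred or thousand records, plus one final flush when the task finishes. A good batch size depends on the database and the record size, so it is worth measuring.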
On Thu, Aug 28, 2008 at 6:27 AM, Yih Sun Khoo [EMAIL PROTECTED] wrote:
Optimizations
Right now I have a job whose reducer phase outputs the key-value pairs as
records into a database