I couldn't agree more. Quite a large share of the questions here relate, more or less, to configuration. Although there are pages explaining how to configure things (which is important), a wider variety of examples would make things even easier.
Bwolen Yang <[EMAIL PROTECTED]> wrote:

Hi,

As a newbie to Hadoop, I have been wondering what the best way to configure my cluster is, especially as one scales up. After seeing Doug's update to sort 900 performance, it occurred to me that it may be helpful to others to see example configuration files, especially for large clusters. Furthermore, if we can diff the configurations over time (and/or across releases), we may be able to see how Hadoop developers tune their own clusters (and hence follow suit :).

Could the configs, along with rough cluster specs, be posted somewhere on Hadoop's website? And perhaps others (with different system setups) could be encouraged to post similarly?

I'm also interested in seeing how people tune their clusters for different kinds of machines (e.g., single-disk machines vs. 4-6 disk machines) and for heterogeneous systems (different CPU power, disk size, memory size, etc.). The heterogeneous part arises for people who are resource-strapped and basically try hard to put together a sizeable system with whatever machines they have. In my case, a bad config can hurt as I add new machines (e.g., a machine with a small disk fills up more quickly, and tasks scheduled there tend to die).

thanks
bwolen

Best Regards,
Richard Yang
[EMAIL PROTECTED]
[EMAIL PROTECTED]
