I think we should also add an available-RAM variable and then do a
reasonable job of deriving many of the other variables in these
settings from it (we may need one for task trackers, one for
namenodes, and so on).
A lot of the memory-related default settings make no sense on the
boxes we use.
What RAM size should we assume as a reasonable default?
2GB? 1GB?
We are currently standardizing on 8GB.
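
For concreteness, here is a rough sketch of the derivation idea. All
of the names and ratios below are hypothetical, not existing Hadoop
configuration keys; the constants are guesses, not measured values:

  // Hypothetical sketch: treat available RAM as the single knob and
  // derive the memory-related defaults from it.
  public class MemoryDefaults {
    private final long availableRamMb;

    public MemoryDefaults(long availableRamMb) {
      this.availableRamMb = availableRamMb;
    }

    /** Daemon heap: most of the box, leaving headroom for the OS. */
    public long heapSizeMb() {
      return availableRamMb * 3 / 4;
    }

    /** Per-task sort buffer: a small fixed fraction, floored at 64MB. */
    public long sortBufferMb() {
      return Math.max(64, availableRamMb / 16);
    }

    public static void main(String[] args) {
      MemoryDefaults d = new MemoryDefaults(8 * 1024);  // the 8GB boxes
      System.out.println("heap=" + d.heapSizeMb()
          + "MB sort=" + d.sortBufferMb() + "MB");
    }
  }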
On Sep 7, 2007, at 7:41 AM, Enis Soztutar wrote:
Hadoop has been used in widely varying cluster sizes (in the range
1-2000 nodes), so I am strongly in favor of as much automatic
configuration as possible.
Doug Cutting wrote:
> Raghu Angadi wrote:
>> Right now the Namenode does not know the cluster size before
>> starting its IPC server.
>
> Sounds like perhaps we should make the handler count, queue size, etc.
> dynamically adjustable, e.g., by adding Server methods such as
> setHandlerCount(), setQueueSize(), etc. There's been talk of trying
> to automatically adjust these within Server.java, based on load, and
> that would be better yet, but short of that, we might adjust them
> heuristically based on cluster size.
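
For illustration, here is roughly what runtime-adjustable sizing could
look like on a much-simplified server; this is a sketch only, not the
actual org.apache.hadoop.ipc.Server API:

  import java.util.ArrayList;
  import java.util.List;
  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  // Hypothetical simplified IPC server with an adjustable handler
  // pool and a soft queue cap; not the real Server.java.
  public class AdjustableServer {
    private final BlockingQueue<Runnable> callQueue =
        new LinkedBlockingQueue<Runnable>();
    private final List<Handler> handlers = new ArrayList<Handler>();
    private volatile int maxQueueSize = 100;

    private class Handler extends Thread {
      volatile boolean running = true;
      public void run() {
        try {
          while (running) {
            callQueue.take().run();  // process one queued call
          }
        } catch (InterruptedException e) {
          // asked to stop while waiting for work
        }
      }
    }

    /** Grow or shrink the handler pool while the server is running. */
    public synchronized void setHandlerCount(int count) {
      while (handlers.size() < count) {
        Handler h = new Handler();
        h.start();
        handlers.add(h);
      }
      while (handlers.size() > count) {
        Handler h = handlers.remove(handlers.size() - 1);
        h.running = false;   // finish the current call, then exit
        h.interrupt();
      }
    }

    /** Adjust the soft cap rather than the queue itself. */
    public void setQueueSize(int size) { maxQueueSize = size; }

    /** Enqueue a call, refusing it once the soft cap is reached. */
    public boolean offer(Runnable call) {
      return callQueue.size() < maxQueueSize && callQueue.offer(call);
    }
  }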
>
> The urgent thing, since we expect the best settings for large clusters
> to change, is to make it so that folks don't need to adjust these
> manually, even if the automation is an ill-understood heuristic. I
> think we can easily get some workable heuristics into 0.15, but we
> might not be able to implement async responses or figure out how
> to adjust it automatically in Server.java or whatever in that
> timeframe. Perhaps we should just change the defaults to be big
> enough for 2000 nodes, but that seems like too big of a hammer.
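
One possible shape of such a heuristic, just to make it concrete; the
constants here are illustrative guesses, not measured values:

  // Hypothetical sizing heuristic: scale handlers and queue length
  // with cluster size, clamped to sane bounds.
  public class IpcSizing {
    public static int handlerCountFor(int clusterSize) {
      // roughly one handler per 40 nodes, between 10 and 100
      return Math.max(10, Math.min(100, clusterSize / 40));
    }

    public static int queueSizeFor(int handlerCount) {
      return 100 * handlerCount;  // allow ~100 queued calls per handler
    }

    public static void main(String[] args) {
      for (int n : new int[] {10, 200, 2000}) {
        System.out.println(n + " nodes -> "
            + handlerCountFor(n) + " handlers, queue "
            + queueSizeFor(handlerCountFor(n)));
      }
    }
  }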
>
> Doug
>